Abstract. We prove the inverse conjecture for the Gowers $U^{s+1}[N]$-norm
for all $s \geq 1$; this is new for $s \geq 4$. More precisely, we establish that if
$f : [N] \to [-1, 1]$ is a function with $\|f\|_{U^{s+1}[N]} \geq \delta$ then there is a
bounded-complexity $s$-step nilsequence $F(g(n)\Gamma)$ which correlates with $f$, where
the bounds on the complexity and correlation depend only on $s$ and $\delta$. From
previous results, this conjecture implies the Hardy-Littlewood prime tuples
conjecture for any linear system of finite complexity.
1. Introduction

The purpose of this paper is to establish the general case of a conjecture named
the Inverse Conjecture for the Gowers norms by the first two authors in
[23, Conjecture 8.3]. If $N$ is a (typically large) positive integer then we write
$[N] := \{1, \dots, N\}$. For each integer $s \geq 1$ the inverse conjecture GI(s), whose
statement we recall shortly, describes the structure of 1-bounded functions
$f : [N] \to \mathbb{C}$ whose $(s+1)$st



1991 Mathematics Subject Classification. 11B30. 




BEN GREEN, TERENCE TAO, AND TAMAR ZIEGLER 



Gowers norm $\|f\|_{U^{s+1}[N]}$ is large. These conjectures together with a good deal of
motivation and background to them are discussed in [19] [21] [23]. The conjectures
GI(1) and GI(2) have been known for some time, the former being a straightforward
application of Fourier analysis, and the latter being the main result of [21] (see also
[51] for the characteristic 2 analogue). The case GI(3) was also recently established
by the authors in [28]. The aim of the present paper is to establish the remaining
cases GI(s) for $s \geq 3$, in particular reestablishing the results in [28].

We begin by recalling the definition of the Gowers norms. If $G$ is a finite abelian
group, $d \geq 1$ is an integer, and $f : G \to \mathbb{C}$ is a function then we define

$$\|f\|_{U^d(G)} := \big( \mathbb{E}_{x, h_1, \dots, h_d \in G} \Delta_{h_1} \cdots \Delta_{h_d} f(x) \big)^{1/2^d}, \tag{1.1}$$

where $\Delta_h f$ is the multiplicative derivative

$$\Delta_h f(x) := f(x + h) \overline{f(x)}$$

and $\mathbb{E}_{x \in X} f(x) := \frac{1}{|X|} \sum_{x \in X} f(x)$ denotes the average of a function
$f : X \to \mathbb{C}$ on a finite set $X$. Thus for instance we have

$$\|f\|_{U^2(G)} := \big( \mathbb{E}_{x, h_1, h_2 \in G} f(x) \overline{f(x + h_1)} \, \overline{f(x + h_2)} f(x + h_1 + h_2) \big)^{1/4}.$$

One can show that $U^d(G)$ is indeed a norm on the functions $f : G \to \mathbb{C}$ for any
$d \geq 2$, though we will not need this fact here.
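
As an informal illustration of (1.1) in the case $d = 2$, the following Python sketch
computes the $U^2$ norm on a cyclic group $\mathbb{Z}/N\mathbb{Z}$ by brute force and compares it
with the standard Fourier expression $\|f\|_{U^2}^4 = \sum_r |\hat f(r)|^4$. The function
names `gowers_u2` and `gowers_u2_fourier` and the test data are illustrative
choices and are not notation from this paper.

```python
import cmath
from itertools import product


def e(x):
    """The fundamental character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def gowers_u2(f, N):
    """Brute-force U^2 norm of f on Z/NZ, where f is a list of length N."""
    total = 0
    for x, h1, h2 in product(range(N), repeat=3):
        total += (f[x]
                  * f[(x + h1) % N].conjugate()
                  * f[(x + h2) % N].conjugate()
                  * f[(x + h1 + h2) % N])
    avg = total / N ** 3
    # The average is real and non-negative up to rounding error.
    return max(avg.real, 0.0) ** 0.25


def gowers_u2_fourier(f, N):
    """The same norm via the l^4 norm of the Fourier coefficients of f."""
    coeffs = [sum(f[x] * e(-x * r / N) for x in range(N)) / N
              for r in range(N)]
    return sum(abs(c) ** 4 for c in coeffs) ** 0.25


if __name__ == "__main__":
    N = 8
    linear_phase = [e(3 * x / N) for x in range(N)]
    signs = [1, -1, 1, 1, -1, 1, -1, -1]
    print(gowers_u2(linear_phase, N))
    print(gowers_u2(signs, N), gowers_u2_fourier(signs, N))
```

A linear phase has $U^2$ norm equal to 1, which is the simplest instance of the
extreme case discussed below; the brute-force loop costs $N^3$ operations and is
only meant for very small $N$.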

In this paper we will be concerned with functions on $[N]$, which is not quite a
group. To define the Gowers norms of a function $f : [N] \to \mathbb{C}$, set $G := \mathbb{Z}/\tilde{N}\mathbb{Z}$
for some integer $\tilde{N} \geq 2^d N$, define a function $\tilde{f} : G \to \mathbb{C}$ by $\tilde{f}(x) = f(x)$ for
$x = 1, \dots, N$ and $\tilde{f}(x) = 0$ otherwise, and set

$$\|f\|_{U^d[N]} := \|\tilde{f}\|_{U^d(G)} / \|1_{[N]}\|_{U^d(G)},$$

where $1_{[N]}$ is the indicator function of $[N]$. It is easy to see that this definition
is independent of the choice of $\tilde{N}$. One could take $\tilde{N} := 2^d N$ for definiteness if
desired.
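
The independence from the choice of the ambient modulus can be checked numerically
on a small example. The sketch below is an illustration only: the helper names
`gowers_power`, `embed` and `interval_norm` are hypothetical, and the brute-force
enumeration is feasible only for very small $N$ and $d$.

```python
from itertools import product


def gowers_power(values, M, d):
    """Return ||values||_{U^d(Z/MZ)}^(2^d) by brute force."""
    total = 0
    for x in range(M):
        for hs in product(range(M), repeat=d):
            term = 1
            for omega in product((0, 1), repeat=d):
                y = (x + sum(w * h for w, h in zip(omega, hs))) % M
                val = values[y]
                # Expanding Delta_{h_1}...Delta_{h_d} f(x): the factor at
                # x + omega.h is conjugated when d - |omega| is odd.
                if (d - sum(omega)) % 2 == 1:
                    val = val.conjugate()
                term *= val
            total += term
    return total / M ** (d + 1)


def embed(f, M):
    """Extend f on [N] = {1, ..., N} by zero to Z/MZ (requires M > N)."""
    out = [0] * M
    for n, value in enumerate(f, start=1):
        out[n % M] = value
    return out


def interval_norm(f, M, d):
    """||f||_{U^d[N]} computed inside Z/MZ, where M should be >= 2^d * N."""
    indicator = [1] * len(f)
    ratio = gowers_power(embed(f, M), M, d) / gowers_power(embed(indicator, M), M, d)
    return max(ratio.real, 0.0) ** (1.0 / 2 ** d)


if __name__ == "__main__":
    f = [1, -1, 1, 1]           # a function on [4]
    print(interval_norm(f, 16, 2), interval_norm(f, 20, 2))
```

Both moduli give the same value because, once the modulus is at least $2^d N$, no
parallelepiped with all vertices in $[N]$ can wrap around the cyclic group.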

The Inverse conjecture for the Gowers $U^{s+1}[N]$-norm, abbreviated as GI(s),
posits an answer to the following question.

Question 1.1. Suppose that $f : [N] \to \mathbb{C}$ is a function bounded in magnitude by
1, and let $\delta > 0$ be a positive real number. What can be said if $\|f\|_{U^{s+1}[N]} \geq \delta$?

Note that in the extreme case $\delta = 1$ one can easily show that $f$ is a phase
polynomial, namely $f(n) = e(P(n))$ for some polynomial $P$ of degree at most $s$.
Furthermore, if $f$ correlates with a phase polynomial, that is to say if
$|\mathbb{E}_{n \in [N]} f(n) e(P(n))| \geq \delta$, then it is easy to show that $\|f\|_{U^{s+1}[N]} \geq c(\delta)$.
It is natural to ask whether the converse is also true: does a large Gowers norm
imply correlation with a polynomial phase function? Surprisingly, the answer is no,
as was observed by Gowers [16] and, in the related context of multiple recurrence,
somewhat earlier by Furstenberg and Weiss [T31 H3]. The work of Furstenberg-Weiss
and Conze-Lesigne [10] draws attention to the role of homogeneous spaces $G/\Gamma$ of
nilpotent Lie groups, and subsequent work of Host and Kra [36] provides a link, in
an ergodic-theoretic context, between these spaces and certain seminorms with a
formal similarity to the Gowers norms under discussion here. Later work of
Bergelson, Host and Kra [1] highlights the role of a class of functions arising from
these spaces $G/\Gamma$ called nilsequences. The inverse conjecture for the Gowers norms,
first formulated precisely in [23, §8], postulates that this class of functions
(which contains the polynomial phases) represents the full set of obstructions to
having large Gowers norm.

We now recall that precise formulation. Recall that an $s$-step nilmanifold is a
manifold of the form $G/\Gamma$, where $G$ is a connected, simply-connected nilpotent Lie
group of step at most $s$ (i.e. all $(s+1)$-fold commutators of $G$ are trivial), and $\Gamma$ is
a discrete, cocompact subgroup of $G$.

Conjecture 1.2 (GI(s)). Let $s \geq 0$ be an integer, and let $0 < \delta \leq 1$. Then there
exists a finite collection $\mathcal{M}_{s,\delta}$ of $s$-step nilmanifolds $G/\Gamma$, each equipped with some
smooth Riemannian metric $d_{G/\Gamma}$, as well as constants $C(s, \delta), c(s, \delta) > 0$ with the
following property. Whenever $N \geq 1$ and $f : [N] \to \mathbb{C}$ is a function bounded in
magnitude by 1 such that $\|f\|_{U^{s+1}[N]} \geq \delta$, there exists a nilmanifold $G/\Gamma \in \mathcal{M}_{s,\delta}$,
some $g \in G$ and a function $F : G/\Gamma \to \mathbb{C}$ bounded in magnitude by 1 and with
Lipschitz constant at most $C(s, \delta)$ with respect to the metric $d_{G/\Gamma}$ such that

$$|\mathbb{E}_{n \in [N]} f(n) F(g^n x)| \geq c(s, \delta).$$

We remark that there are many equivalent ways to reformulate this conjecture.
For instance, instead of working with a finite family $\mathcal{M}_{s,\delta}$ of nilmanifolds, one could
work with a single nilmanifold $G/\Gamma = G_{s,\delta}/\Gamma_{s,\delta}$, by taking the Cartesian product
of all the nilmanifolds in the family. Other reformulations include an equivalent
formulation using polynomial nilsequences rather than linear ones (see Conjecture
4.5) and an ultralimit formulation (see Conjecture 5.3). One can also formulate the
conjecture using bracket polynomials, or local polynomials; see [21] for a discussion
of these equivalences in the $s = 2$ case.

Let us briefly review the known partial results on this conjecture: 

(i) GI(0) is trivial. 

(ii) GI(1) follows from a short Fourier-analytic computation. 

(iii) GI(2) was established about five years ago in [21], building on work of 
Gowers [16]. 

(iv) GI(3) was established, quite recently, in [28] . 

(v) In the extreme case $\delta = 1$ one can easily show that $f(n) = e(P(n))$ for
some polynomial $P$ of degree at most $s$, and every such function is an
$s$-step nilsequence by a direct construction. See, for example, [21] for the
case $s = 2$.

(vi) In the almost extremal case $\delta \geq 1 - \varepsilon_s$, for some $\varepsilon_s > 0$, one may see that
$f$ correlates with a phase $e(P(n))$ by adapting arguments first used in the
theoretical computer-science literature [1].

(vii) The analogue of GI(s) in ergodic theory (which, roughly speaking,
corresponds to the asymptotic limit $N \to \infty$ of the theory here; see [37] for
further discussion) was formulated and established in [36], work done
independently of the work of Gowers (see also the earlier paper [35]). This work
was the first place in the literature to link objects of Gowers-norm type
(associated to functions on a measure-preserving system $(X, T, \mu)$) with
flows on nilmanifolds, and the subsequent paper [4] was the first work to
underline the importance of nilsequences. The formulation of GI(s) by
the first two authors in [23] was very strongly influenced by these works.
For the closely related problem of analysing multiple ergodic averages, the
relevance of flows on nilmanifolds was earlier pointed out in [13] [HI [47],
building upon earlier work in [10]. See also [34] [60] for related work on
multiple averages and nilmanifolds in ergodic theory.

(viii) The analogue of GI(s) in finite fields of large characteristic was established
by ergodic-theoretic methods in [5] [56].

(ix) A weaker "local" version of the inverse theorem (in which correlation takes
place on a subprogression of $[N]$ of size $\sim N^{c_s}$) was established by Gowers
[17]. This paper provided a good deal of inspiration for our work here.

(x) The converse statement to GI(s), namely that correlation with a function
of the form $n \mapsto F(g^n x)$ implies that $f$ has large $U^{s+1}[N]$-norm, is also
known. This was first established in [HI Proposition 12.6], following
arguments of Host and Kra [36] rather closely. A rather simple proof of this
result is given in [28, Appendix G].

The main result of this paper is a proof of Conjecture 1.2:

Theorem 1.3. For any $s \geq 3$, the inverse conjecture for the $U^{s+1}[N]$-norm, GI(s),
is true.

By combining this result with the previous results in [23] [26] we obtain a
quantitative Hardy-Littlewood prime tuples conjecture for all linear systems of finite
complexity; in particular, we now have the expected asymptotic for the number of
primes $p_1 < \dots < p_k \leq X$ in arithmetic progression, for every fixed positive integer
$k$. We refer to [23] for further discussion, as we have nothing new to add here
regarding these applications. Several further applications of the GI(s) conjectures
are given in [HI [27].

2. Strategy of the proof 

The proof of Theorem 1.3 is long and complicated, but broadly speaking it
follows the strategy laid out in previous works [THl [IZ1 (HI HE1 1SI]. We induct on
$s$, assuming that GI(s - 1) has already been established and using this to prove
GI(s). To explain the argument, let us first summarise the main steps taken in
[28] in order to deduce GI(3), the inverse theorem for the $U^4$-norm, from GI(2),
the inverse theorem for the $U^3$-norm (established in [21]). Once this is done we
will explain some of the extra difficulties involved in handling the general case. For
a more extensive (but informal) discussion of the proof strategy, see [25]. Once
we set up some technical machinery, we will also be able to give a more detailed
description of the strategy in §7.

Here, then, is an overview of the argument in [28] . 

(i) (Apply induction) If $\|f\|_{U^4[N]} \gg 1$ then, for many $h$, $\|\Delta_h f\|_{U^3[N]} \gg 1$ and
so $\Delta_h f$ correlates with a 2-step nilsequence $\chi_h$.

(ii) (Nilcharacter decomposition) $\chi_h$ may be decomposed as a sum of a special
type of nilsequence called a nilcharacter, essentially by a Fourier
decomposition. For the sake of illustration, these 2-step nilcharacters may be
supposed to have the form

$$\chi_h(n) = e(\{\alpha_h n\} \beta_h n),$$

although these are not quite nilcharacters due to the discontinuous nature
of the fractional part function $x \mapsto \{x\}$, and in any event a general
2-step nilcharacter will be modeled by a linear combination of such "bracket
quadratic monomials", rather than by a single such monomial (see [21] for
further discussion).

(iii) (Rough linearity) The fact that $\Delta_h f$ correlates with $\chi_h$ forces $\chi_h$ to behave
weakly linearly in $h$. To get a feel for why this is so, suppose that $|f| = 1$;
then we have the cocycle identity

$$\Delta_{h+k} f(n) = \Delta_h f(n + k) \Delta_k f(n).$$

To capture something like the same behaviour in the much weaker setting
where $\Delta_h f$ correlates with $\chi_h$, we use an extraordinary argument of
Gowers [16] relying on the Cauchy-Schwarz inequality. Roughly speaking, the
information obtained is of the form

$$\chi_{h_1} \chi_{h_2} \sim \chi_{h_3} \chi_{h_4} \quad \text{modulo lower order terms} \tag{2.1}$$

for many $h_1, h_2, h_3, h_4$ with $h_1 + h_2 = h_3 + h_4$.

(iv) (Furstenberg-Weiss) An argument of Furstenberg and Weiss [14] is adapted
in order to study (2.1). The quantitative distribution theory of
nilsequences developed in [24] is a major input here. It is concluded that we
may assume that the frequency $\beta_h$ does not actually depend on $h$. Note
that this step appeared for the first time in the proof of GI(3); it did not
feature in the proof of GI(2) in [21].

(v) (Linearisation) A similar argument allows one to then assert that

$$\alpha_{h_1} + \alpha_{h_2} \approx \alpha_{h_3} + \alpha_{h_4} \pmod 1 \tag{2.2}$$

for many $h_1, h_2, h_3, h_4$ with $h_1 + h_2 = h_3 + h_4$.

(vi) (Additive Combinatorics) By arguments from additive combinatorics
related to the Balog-Szemeredi-Gowers theorem [3] [TB] and Freiman's
theorem, as well as some geometry of numbers, we may then assume that $\alpha_h$
varies "bracket-linearly" in $h$, thus

$$\alpha_h = \gamma_1 \{\eta_1 h\} + \dots + \gamma_d \{\eta_d h\}. \tag{2.3}$$

Up to top order, then, the nilcharacter $\chi_h(n)$ can now be assumed to take
the form $e(\psi(h, n, n))$, where $\psi$ is "bracket-multilinear"; it is a sum of
terms such as $\{\gamma \{\eta h\} n\} \beta n$.

(vii) (Symmetry argument) The bracket multilinear form $\psi$ obeys an additional
symmetry property. This is a reflection of the identity $\Delta_h \Delta_k f = \Delta_k \Delta_h f$,
but transferring this to the much weaker setting in which we merely have
correlation of $\Delta_h f$ with $\chi_h$ requires another appeal to Gowers'
Cauchy-Schwarz argument from (iii). In fact, the key point is to look at the second
order terms in (2.1).

(viii) (Integration) Assuming this symmetry, one is able to express

$$\chi_h(n) \sim \Theta(n + h) \overline{\Theta'(n)}$$

for some bracket cubic functions $\Theta, \Theta'$, which morally take the form

$$\Theta(n), \Theta'(n) \sim e(\psi(n, n, n)/3)$$

(for much the same reason that $x^3/3$ is an antiderivative of $x^2$). Thus we
morally have

$$\Delta_h f(n) \sim \Theta(n + h) \overline{\Theta'(n)}.$$

(ix) (Construction of a nilsequence) Any bracket cubic form like $e(\psi(n, n, n))$
"comes from" a 3-step nilmanifold; this construction is accomplished in
[28] in a rather ad hoc manner.

(x) From here, one can analyse lower order terms by the induction hypothesis 
GI(2). This is a relatively easy matter. 
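
Two exact identities that the outline above relies on, the cocycle identity from
step (iii) and the commutativity of derivatives from step (vii), can be checked
numerically for a unimodular function. The following sketch is illustrative only;
the test function `f` (a cubic phase with arbitrarily chosen coefficients) and the
helper name `derivative` are assumptions made for the example.

```python
import cmath


def e(x):
    """The fundamental character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def derivative(f, h):
    """Multiplicative derivative: (Delta_h f)(n) = f(n + h) * conj(f(n))."""
    return lambda n: f(n + h) * f(n).conjugate()


def f(n):
    """A hypothetical unimodular test function (a cubic phase)."""
    return e(0.1234 * n ** 3 + 0.5 * n)


def cocycle_defect(h, k, n):
    """|Delta_{h+k} f(n) - Delta_h f(n+k) * Delta_k f(n)|, zero when |f| = 1."""
    lhs = derivative(f, h + k)(n)
    rhs = derivative(f, h)(n + k) * derivative(f, k)(n)
    return abs(lhs - rhs)


def commutation_defect(h, k, n):
    """|Delta_h Delta_k f(n) - Delta_k Delta_h f(n)|, always zero."""
    lhs = derivative(derivative(f, k), h)(n)
    rhs = derivative(derivative(f, h), k)(n)
    return abs(lhs - rhs)


if __name__ == "__main__":
    print(cocycle_defect(2, 3, 5), commutation_defect(2, 3, 5))
```

Both defects vanish up to floating-point error. The difficulty addressed in steps
(iii) and (vii) is that only correlation of $\Delta_h f$ with $\chi_h$ is available, so these
exact identities have to be replaced by the much weaker statement (2.1).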

Let us now discuss the argument of this paper in the light of each point of this
outline. A more detailed outline is given in §7. Assume that GI(s - 1) has been
established.

(i) (Apply induction) If $\|f\|_{U^{s+1}[N]} \gg 1$ then, for many $h$, $\|\Delta_h f\|_{U^s[N]} \gg 1$
and so $\Delta_h f$ correlates with an $(s-1)$-step nilsequence $\chi_h$. This is
straightforward (see §7).

(ii) (Nilcharacter decomposition) $\chi_h$ may be decomposed into nilcharacters;
this is fairly straightforward as well. It is somewhat reassuring to think
of $\chi_h(n)$ as having the form $e(\psi_h(n))$, where $\psi_h(n)$ is a bracket
polynomial "of degree $s - 1$", but we will not be working explicitly with bracket
polynomials much in this paper, except as motivation and as a source of
examples. One of the main challenges one is faced with during an attempt
to prove GI(4) by a direct generalisation of our arguments from [28] is the
fact that already bracket cubic polynomials are rather complicated to deal
with and can take different forms such as $\{\alpha n\}\{\beta n\}\gamma n$ and $\{\{\alpha n\}\beta n\}\gamma n$.

Instead of objects such as $e(\alpha n \{\beta n\})$, then, we will work with the rather
more abstract notion of a symbol. This notion, which is fairly central to
our paper, is defined and discussed in §6. One additional technical point
is worth mentioning here. This is the fact that $e(\alpha n \{\beta n\})$ (say) cannot
be realised as a nilsequence $F(g^n \Gamma)$ with $F$ continuous, and therefore the
distributional results of [24] do not directly apply. In [28] these
discontinuities could be understood quite explicitly, but here we take a different
approach: we decompose $G/\Gamma$ into $D$ pieces using a smooth partition of
unity for some $D = O(1)$, and then work instead with the (smooth)
$\mathbb{C}^D$-valued nilsequence consisting of these pieces.

We discuss this device more fully in iJH] but we emphasise that this is a 
technical device and the reader is advised not to give this particular aspect 
of the proof too much attention. 

(iii) (Rough linearity) $\chi_h$ varies roughly linearly in $h$; this is another fairly
straightforward modification of the arguments of Gowers, already
employed in [28], which is performed in fJ5J

(iv) (Furstenberg-Weiss) This proceeds along similar lines to the corresponding
argument in [28] but is, in a sense, rather easier once one has developed
the device of $\mathbb{C}^D$-valued nilsequences, which allow one to remain in the
smooth category; this is accomplished in §11 after a substantial amount of
preparatory material in §9, §10 and Appendix D.

(v) (Linearisation) This is also quite similar to the corresponding argument in
[28], and is performed in §11. In both of parts (iv) and (v), the "bracket
calculus" from [28] is replaced by the more conceptual "symbol calculus"
developed in Appendix E.

(vi) (Additive Combinatorics) The additive combinatorial input is much the
same as in [28]. For the convenience of the reader we sketch it in Appendix E.
(vii) (Construction of a nilsequence) Our argument differs quite substantially
from that in [28] at this point. The $s$-step nilobject, which is now a
two-variable object $\chi(h, n)$, is constructed before the symmetry argument and
in a more conceptual manner. This may be compared with the rather ad
hoc approach taken in [21] [28], where various bracket polynomials were
merely exhibited as arising from nilsequences. We perform this
construction in SfH

(viii) (Symmetry argument) We replace $\chi(h, n)$ with an equivalent nilcharacter
$\tilde{\chi}(h, n, \dots, n)$ where $\tilde{\chi}$ is a nilcharacter in $s$ variables that is symmetric
in the last $s - 1$ variables. The symmetry argument given in §13 shows
that $\tilde{\chi}(h, n, \dots, n)$ is equivalent to $\tilde{\chi}(n, h, \dots, n)$. Again the key idea in
the analysis is to look at the second order terms in (2.1).

(ix) (Integration) With the symmetry in hand, we can use the calculus of
multilinear nilcharacters to essentially express $\tilde{\chi}(h, n, \dots, n)$ as the derivative of an
expression which is roughly of the form $\tilde{\chi}(n, \dots, n)/s$; see §13 for details.

(x) The final step of the argument is relatively straightforward, as before; see 

m 

In our previous paper [28] it was already rather painful to keep proper track
of such notions as "many" and "correlates with" . Here matters are even worse, 
and so to organise the above tasks it turns out to be quite convenient to first 
take an ultralimit of all objects being studied, effectively placing one in the setting 
of nonstandard analysis. This allows one to easily import results from infinitary 
mathematics, notably the theory of Lie groups and basic linear algebra, into the
finitary setting of functions on $[N]$. In §5 and Appendix A we review the basic
machinery of ultralimits that we will need here; we will not be exploiting any 
particularly advanced aspects of this framework. The reader does not really need 
to understand the ultrafilter language in order to comprehend the basic structure 
of the paper, provided that he/she is happy to deal with concepts like "dense" and 
"correlates with" in a somewhat informal way, resembling the way in which analysts 
actually talk about ideas with one another (and, in fact, analogous to the way we 
wrote this paper). It is possible to go through the paper and properly quantify all 
of these notions using appropriate parameters $\delta$ and (many) growth functions $\mathcal{F}$.
This would have the advantage of making the paper on some level comprehensible 
to the reader with an absolute distrust of ultrafilters, and it would also remove the 
dependence on the axiom of choice and in principle provide explicit but very poor 
bounds. However it would cause the argument to be significantly longer, and the 
notation would be much bulkier. 

Our exposition will be as follows. We will begin by spending some time intro- 
ducing the ultrafilter language and then, motivated by examples, the notions of 
nilsequence, nilcharacter and symbol. Once that is done we will, in §7, give the
high-level argument for Theorem 1.3; this consists of detailing points (i), (ii) and
(x) of the outline above and giving proper statements of the other main points.

The discussion above concerning points (iii), (iv), (v) and (vi) has been simplified 
for the sake of exposition. In actual fact, these points are dealt with together by 
a kind of iterative loop, in which more and more bracket-linear structure is placed 
on the nilcharacters $\chi_h(n)$ by cycling from (iii) to (vi) repeatedly.

We remark that a quite different approach using ultrafilters to the structural
theory of the Gowers norms is in the process of being carried out in [52] [53] [54];
this seems related to the work of Host and Kra, whereas our work ultimately derives
from the work of Gowers.

We also make the minor remark that our proof of GI(s) is restricted to the case
$s \geq 3$ for minor technical reasons. In particular, we take advantage of the
non-trivial nature of the degree $s - 2$ "lower order terms" in the Gowers
Cauchy-Schwarz argument (Proposition 7.3) in the symmetry argument step; and we will
also observe that the various "smooth" and "periodic" error terms arising from the
equidistribution theory in Appendix D are of degree 1 and thus negligible compared
with the main terms in the analysis, which are of degree $s - 1$. The arguments can
be modified to give a proof of GI(2), although this proof would basically be a
notationally intensive repackaging of the arguments in [21].
3. Basic notation

We write $\mathbb{N} := \{0, 1, 2, \dots\}$ for the natural numbers, and $\mathbb{N}^+ := \{1, 2, \dots\}$ for
the positive natural numbers. Given two integers $N, M$, we write $[N, M]$ for the
discrete interval $[N, M] := \{n \in \mathbb{Z} : N \leq n \leq M\}$. We also make the abbreviations
$[N] := [1, N]$ and $[[N]] := [-N, N]$. If $x$ is a real number, we write $x \bmod 1$
for the associated residue class in the unit circle $\mathbb{T} := \mathbb{R}/\mathbb{Z}$, and write $x = y \bmod 1$
if $x$ and $y$ differ by an integer.

We will rely frequently on the following two elementary functions: the
fundamental character $e : \mathbb{R} \to \mathbb{C}$ (or $e : \mathbb{T} \to \mathbb{C}$) defined by

$$e(x) := e^{2\pi i x},$$

and the signed fractional part function $\{\} : \mathbb{R} \to I_0$, where $I_0$ is the fundamental
domain

$$I_0 := \{x \in \mathbb{R} : -1/2 < x \leq 1/2\}$$

and $\{x\}$ is the unique real number in $I_0$ such that $x = \{x\} \bmod 1$. We will often
rely on the identity

$$e(x) = e(\{x\}) = e(x \bmod 1)$$

without further comment.
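
For concreteness, here is a small Python sketch of these two functions. It assumes
the fundamental domain is the half-open interval $(-1/2, 1/2]$, which is what makes
$\{x\}$ unique; the name `signed_frac` is an illustrative choice.

```python
import cmath
import math


def e(x):
    """The fundamental character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)


def signed_frac(x):
    """The unique y in (-1/2, 1/2] such that x - y is an integer."""
    return x - math.ceil(x - 0.5)


if __name__ == "__main__":
    for x in (0.2, 0.75, 1.5, -0.5, -2.25):
        y = signed_frac(x)
        # e(x) and e({x}) agree because x - {x} is an integer.
        print(x, y, abs(e(x) - e(y)))
```

Near the origin `signed_frac` is the identity map, which is the convenience of the
signed convention over the unsigned fractional part.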

For technical reasons, we will need to manipulate vector-valued complex
quantities in a manner analogous to scalar complex quantities. If $v = (v_i)_{i=1}^{D}$ and
$w = (w_{i'})_{i'=1}^{D'}$ are vectors in $\mathbb{C}^D$ and $\mathbb{C}^{D'}$ respectively then we form the tensor
product $v \otimes w \in \mathbb{C}^{DD'}$ by the formula

$$v \otimes w := (v_1 w_1, \dots, v_D w_{D'})$$



The signed fractional part will be slightly more convenient to work with than the unsigned 
fractional part, as it is equal to the identity near the origin. 
and the complex conjugate $\overline{v} \in \mathbb{C}^D$ by the formula

$$\overline{v} := (\overline{v_1}, \dots, \overline{v_D}).$$

Similarly, if $X$ is some set and $f : X \to \mathbb{C}^D$ and $g : X \to \mathbb{C}^{D'}$ are functions then
we write $f \otimes g : X \to \mathbb{C}^{DD'}$ for the function defined by $(f \otimes g)(x) := f(x) \otimes g(x)$,
and similarly define $\overline{f} : X \to \mathbb{C}^D$.

If $G = (G, +)$ is an additive group, $k \in \mathbb{N}$, $g = (g_1, \dots, g_k) \in G^k$, and
$a = (a_1, \dots, a_k) \in \mathbb{Z}^k$, we define the dot product

$$a \cdot g := a_1 g_1 + \dots + a_k g_k.$$

Given a set $H$ in an additive group, define an additive quadruple in $H$ to be a
quadruple $(h_1, h_2, h_3, h_4) \in H^4$ with $h_1 + h_2 = h_3 + h_4$. The number of additive
quadruples in $H$ is known as the additive energy of $H$ and is denoted $E(H)$.

A map $\phi : H \to G$ from $H$ to another additive group $G$ is said to be a Freiman
homomorphism if it preserves additive quadruples, i.e. if
$\phi(h_1) + \phi(h_2) = \phi(h_3) + \phi(h_4)$ for all additive quadruples $(h_1, h_2, h_3, h_4)$ in $H$.
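
The last two definitions are easy to experiment with for small sets of integers.
The following sketch enumerates additive quadruples by brute force; the function
names are illustrative and are not notation from the paper.

```python
from itertools import product


def additive_quadruples(H):
    """All (h1, h2, h3, h4) in H^4 with h1 + h2 == h3 + h4."""
    H = list(H)
    return [q for q in product(H, repeat=4) if q[0] + q[1] == q[2] + q[3]]


def additive_energy(H):
    """E(H): the number of additive quadruples in H."""
    return len(additive_quadruples(H))


def is_freiman_homomorphism(phi, H):
    """Check that phi preserves every additive quadruple in H."""
    return all(phi(a) + phi(b) == phi(c) + phi(d)
               for a, b, c, d in additive_quadruples(H))


if __name__ == "__main__":
    H = range(3)
    print(additive_energy(H))
    print(is_freiman_homomorphism(lambda h: 5 * h + 2, H))
    print(is_freiman_homomorphism(lambda h: h * h, H))
```

For $H = \{0, 1, 2\}$ the sums $0, 1, 2, 3, 4$ have $1, 2, 3, 2, 1$ representations
respectively, so $E(H) = 1 + 4 + 9 + 4 + 1 = 19$. An affine map preserves additive
quadruples, while squaring fails on the quadruple $(0, 2, 1, 1)$.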

Given a multi-index $d = (d_1, \dots, d_k) \in \mathbb{N}^k$, we write $|d| := d_1 + \dots + d_k$.

We now briefly review and clarify some standard notation from group theory. 

When we do not assume a group $G$ to be abelian, we will always write $G$
multiplicatively: $G = (G, \cdot)$. However, when dealing with abelian groups, we reserve the
right to use additive notation instead.

We view an $n$-tuple $(a_1, \dots, a_n)$ of labels as a finite ordered set with the ordering
$a_1 < \dots < a_n$. If $A = (a_1, \dots, a_n)$ is a finite ordered set and $(g_a)_{a \in A}$ are a collection
of group elements in a multiplicative group $G$, we define the ordered products

$$\prod_{a \in A} g_a := g_{a_1} \cdots g_{a_n}, \qquad \prod_{i=1}^{n} g_i := g_1 \cdots g_n \qquad \text{and} \qquad \prod_{i=n}^{1} g_i := g_n \cdots g_1$$

for any $n \geq 0$, with the convention that the empty product is the identity. We
extend this notation to infinite products under the assumption that all but finitely
many of the factors are equal to the identity.

Given a subset $A$ of a group $G$, we let $\langle A \rangle$ denote the subgroup of $G$ generated
by $A$. Given a family $(H_i)_{i \in I}$ of subgroups of $G$, we write $\bigvee_{i \in I} H_i$ for the smallest
subgroup of $G$ that contains all of the $H_i$.

Given two elements $g, h$ of a multiplicative group $G$, we define the commutator

$$[g, h] := g^{-1} h^{-1} g h.$$

We write $H \leq G$ to denote the statement that $H$ is a subgroup of $G$. If $H, K \leq G$,
we let $[H, K]$ be the subgroup generated by the commutators $[h, k]$ with $h \in H$ and
$k \in K$, thus $[H, K] = \langle \{[h, k] : h \in H, k \in K\} \rangle$.

If $r \geq 1$ is an integer and $g_1, \dots, g_r \in G$, we define an $(r-1)$-fold iterated
commutator of $g_1, \dots, g_r$ inductively by declaring $g_1$ to be the only 0-fold iterated
commutator of $g_1$, and for $r > 1$ defining an $(r-1)$-fold iterated commutator to be
any expression of the form $[w, w']$, where $w$ and $w'$ are $(s-1)$-fold and $(s'-1)$-fold
commutators of $g_{i_1}, \dots, g_{i_s}$ and $g_{i'_1}, \dots, g_{i'_{s'}}$ respectively, where $s, s' \geq 1$ are such
that $s + s' = r$, and $\{i_1, \dots, i_s\} \cup \{i'_1, \dots, i'_{s'}\} = \{1, \dots, r\}$ is a partition of $\{1, \dots, r\}$
into two classes. Thus for instance $[[g_3, g_1], [g_2, g_4]]$ and $[g_2, [g_1, [g_3, g_4]]]$ are 3-fold
iterated commutators of $g_1, \dots, g_4$.

The following lemma will be useful for computing commutator groups. 






Lemma 3.1. Let $H = \langle A \rangle$, $K = \langle B \rangle$ be normal subgroups of a nilpotent group $G$
that are generated by sets $A \subset H$, $B \subset K$ respectively. Then $[H, K]$ is normal,
and is also the subgroup generated by the $(i + j - 1)$-fold iterated commutators of
$a_1, \dots, a_i, b_1, \dots, b_j$ with $a_1, \dots, a_i \in A$, $b_1, \dots, b_j \in B$ and $i, j \geq 1$.

Proof. The normality of $[H, K]$ follows from the identity

$$g [H, K] g^{-1} = [g H g^{-1}, g K g^{-1}].$$

It is then clear that $[H, K]$ contains the group generated by the iterated
commutators of elements in $A, B$ that involve at least one element from each. The converse
follows inductively using the identities

$$[x, y] = [y, x]^{-1}, \qquad [xy, z] = [x, z] [[x, z], y] [y, z] \qquad \text{and} \qquad [x, y^{-1}] = [y, x] [[y, x], y^{-1}]. \tag{3.1}$$

This concludes the proof. □
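
The commutator identities (3.1) hold in any group under the convention
$[g, h] = g^{-1} h^{-1} g h$, and they can be sanity-checked in a concrete nilpotent group.
The sketch below uses $4 \times 4$ upper unitriangular integer matrices, a standard
example of a 3-step nilpotent group; the specific matrices `x`, `y`, `z` and all
helper names are arbitrary choices made for illustration.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def unitri_inv(A):
    """Inverse of an upper unitriangular matrix A = I + N, using the
    finite series I - N + N^2 - ... (N is nilpotent)."""
    n = len(A)
    I = identity(n)
    N = [[A[i][j] - I[i][j] for j in range(n)] for i in range(n)]
    result, power, sign = I, I, 1
    for _ in range(n - 1):
        power = mat_mul(power, N)
        sign = -sign
        result = [[result[i][j] + sign * power[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def comm(g, h):
    """Commutator [g, h] = g^{-1} h^{-1} g h."""
    return mat_mul(mat_mul(unitri_inv(g), unitri_inv(h)), mat_mul(g, h))


x = [[1, 1, 0, 2], [0, 1, 2, 0], [0, 0, 1, -1], [0, 0, 0, 1]]
y = [[1, -1, 3, 0], [0, 1, 1, 1], [0, 0, 1, 2], [0, 0, 0, 1]]
z = [[1, 2, 1, 1], [0, 1, -3, 0], [0, 0, 1, 1], [0, 0, 0, 1]]

if __name__ == "__main__":
    print(comm(x, y) == unitri_inv(comm(y, x)))
    print(comm(mat_mul(x, y), z)
          == mat_mul(mat_mul(comm(x, z), comm(comm(x, z), y)), comm(y, z)))
    print(comm(x, unitri_inv(y))
          == mat_mul(comm(y, x), comm(comm(y, x), unitri_inv(y))))
```

In this group every iterated commutator involving four elements is the identity,
which is what makes inductive arguments of the kind used in the proof terminate.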
As a corollary of the above lemma, we have the distributive law

$$\Big[ \bigvee_{i \in I} H_i, \bigvee_{j \in J} K_j \Big] = \bigvee_{i \in I, j \in J} [H_i, K_j]$$

whenever $(H_i)_{i \in I}$, $(K_j)_{j \in J}$ are families of normal subgroups of a nilpotent group
$G$.

If $H \lhd G$ is a normal subgroup of $G$, and $g \in G$, we use $g \bmod H$ to denote the
coset representative $gH$ of $g$ in $G/H$. For instance, $g = g' \bmod H$ if $gH = g'H$.

At various stages in the paper we will need the (discrete) Baker-Campbell-Hausdorff
formula in the following weak form:

$$g_1^{n_1} g_2^{n_2} = g_2^{n_2} g_1^{n_1} \prod_a g_a^{P_a(n_1, n_2)} \tag{3.2}$$

for all $g_1, g_2$ in a nilpotent group $G$ and all integers $n_1, n_2$, where $g_a$ ranges over
all iterated commutators of $g_1, g_2$ that involve at least one copy of each (note from
nilpotency that there are only finitely many non-trivial $g_a$), with the $a$ ordered in
some arbitrary fashion, and $P_a : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$ are polynomials. Furthermore, if $g_a$
involves $d_1$ copies of $g_1$ and $d_2$ copies of $g_2$, then $P_a$ has degree at most $d_1$ in the
$n_1$ variable and $d_2$ in the $n_2$ variable.
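
In a 2-step nilpotent group the formula (3.2) has a single correction term: the only
non-trivial iterated commutator is $[g_1, g_2]$, and $P(n_1, n_2) = n_1 n_2$. The sketch
below checks this in the integer Heisenberg group, with elements stored as triples
$(a, b, c)$ standing for the upper unitriangular matrix with entries $a, b$ on the
superdiagonal and $c$ in the corner. The encoding and helper names are assumptions
made for the example.

```python
IDENTITY = (0, 0, 0)


def mul(g, h):
    """Product in the Heisenberg group: (a,b,c)(a',b',c') = (a+a', b+b', c+c'+a*b')."""
    a, b, c = g
    a2, b2, c2 = h
    return (a + a2, b + b2, c + c2 + a * b2)


def inv(g):
    a, b, c = g
    return (-a, -b, -c + a * b)


def power(g, n):
    """g^n for any integer n, by repeated multiplication."""
    if n < 0:
        return power(inv(g), -n)
    result = IDENTITY
    for _ in range(n):
        result = mul(result, g)
    return result


def comm(g, h):
    """Commutator [g, h] = g^{-1} h^{-1} g h."""
    return mul(mul(inv(g), inv(h)), mul(g, h))


def bch_holds(g1, g2, n1, n2):
    """Check g1^n1 g2^n2 == g2^n2 g1^n1 [g1, g2]^(n1*n2)."""
    lhs = mul(power(g1, n1), power(g2, n2))
    rhs = mul(mul(power(g2, n2), power(g1, n1)), power(comm(g1, g2), n1 * n2))
    return lhs == rhs


if __name__ == "__main__":
    g1, g2 = (1, 2, 3), (-2, 1, 4)
    print(all(bch_holds(g1, g2, n1, n2)
              for n1 in range(-3, 4) for n2 in range(-3, 4)))
```

The exponent $n_1 n_2$ has degree 1 in each variable, matching the degree bound
stated above for a commutator involving one copy each of $g_1$ and $g_2$.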

Let $G$ be a connected, simply connected, nilpotent Lie group (or nilpotent Lie
group for short). Then we denote the Lie algebra of $G$ as $\log G$. As is well known
(see e.g. [7]), the exponential map $\exp : \log G \to G$ is a homeomorphism, inverted
by the logarithm map $\log : G \to \log G$, and we can then define the exponentiation
operation $g^t$ for any $g \in G$ and $t \in \mathbb{R}$ by the formula

$$g^t := \exp(t \log g).$$

There is a continuous version of the Baker-Campbell-Hausdorff formula:

$$g_1^{t_1} g_2^{t_2} = g_2^{t_2} g_1^{t_1} \prod_a g_a^{P_a(t_1, t_2)} \tag{3.3}$$

for all $t_1, t_2 \in \mathbb{R}$ and $g_1, g_2 \in G$, where $P_a$ are the polynomials occurring in (3.2).
We also observe the variant formulae 
for some polynomials $Q_a$ and all $t \in \mathbb{R}$, $g_1, g_2 \in G$, and

$$\exp(t_1 \log g_1 + t_2 \log g_2) = g_1^{t_1} g_2^{t_2} \prod_a g_a^{R_a(t_1, t_2)}$$

for some further polynomials $R_a$ and all $t_1, t_2 \in \mathbb{R}$, $g_1, g_2 \in G$. We refer to all of
these formulae collectively as the Baker-Campbell-Hausdorff formula.

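If $A$ is a subset of a nilpotent Lie group $G$, we let $\langle A \rangle_{\mathbb{R}}$ be the smallest connected
Lie subgroup of $G$ containing $A$, or more explicitly

$$\langle A \rangle_{\mathbb{R}} := \langle \{a^t : a \in A; t \in \mathbb{R}\} \rangle.$$

Equivalently, $\log \langle A \rangle_{\mathbb{R}}$ is the Lie algebra generated by $\log A$.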

A lattice of a nilpotent Lie group $G$ is a discrete cocompact subgroup $\Gamma$ of $G$.
Thus for instance, we see from (3.2) that for any finite set $A$ in $G$, $\langle A \rangle$ will be a
cocompact subgroup of $\langle A \rangle_{\mathbb{R}}$, and will thus be a lattice if $\langle A \rangle$ is discrete.

A connected Lie subgroup $H$ of $G$ is said to be rational with respect to $\Gamma$ if
$\Gamma \cap H$ is cocompact in $H$. For instance, if $G = \mathbb{R}^2$, $\Gamma$ is the standard lattice $\mathbb{Z}^2$,
and $\alpha \in \mathbb{R}$, then the connected Lie subgroup $H := \{(x, \alpha x) : x \in \mathbb{R}\}$ is rational if
and only if $\alpha$ is rational.

Further notation. Here is a list of further notation used in the paper for
reference, together with the place in the paper where each piece is defined and
discussed.

poly(i?N — > Gn) polynomial maps from one filtered group Hjq to Gn 16.181 

poly(Zpj — > Gn) polynomial maps with the degree filtration 16.181 

poly(Z^ fc — > Gpjfc) polynomial maps with the multidegree filtration 16.181 

poly(ZnR — > Gdr) polynomial maps with the degree-rank filtration 16.181 

L°°(n -» C D ) bounded limit functions to *C d (jA~Tj) 

L°°(n -4 CT) bounded limit functions (also L°°(0)) (jA~T]) 

Lip(* (G/r) — » C ) bd'd limit functions with bounded Lipschitz constant 15.11 

Nil d ([iV]) nilsequences of degree < d on [AT] [5J2] 

Nil cJ (f2) nilsequences of degree C J 16.191 

S d ([iV]) space of degree d nilcharacters on [N] 16.11 

"Muitf multidegree nilcharacters 16.191 

Sp^^il) degree-rank nilcharacters 16.191 

Symb d ([7V]) equiv. classes of degree d nicharacters in S d ([iV]) 16.61 

Symb^^y'^- 1 (fl) equiv. classes of multidegree nicharacters 16.221 

Symb^f^^^) equiv. classes of degree-rank nicharacters 16.221 

G D , G ,D ^( S - 1 .'"*) universal nilpotent Lie group of degree-rank (s — 1, r*) 19.11 

Horizi(G) «'th horizontal space of G 19.61 

Taylor^ (g) i'th horizontal Taylor coefficient of a polynomial map 19.61 

(D, 77, F) total frequency representation of a nilcharacter 19.111 



4. The polynomial formulation of GI(s) 

The inverse conjecture GI(s), Conjecture 1.2, has been formulated using linear
nilsequences $F(g^n x \Gamma)$. This is largely for compatibility with the earlier paper [23] of
the first two authors on linear equations in primes, where the conjecture
was stated in precisely this form as Conjecture 8.3. Subsequently, however, it was
discovered that it is more natural to deal with a somewhat more general class
of object called a polynomial nilsequence $F(g(n)\Gamma)$. This is particularly so when
it comes to discussing the distributional properties of nilsequences, as was done
in [24]. Thus, we shall now recast the inverse conjecture in terms of polynomial
nilsequences, which is the formulation we will work with throughout the rest of the
paper.

Let us first recall the definition of a polynomial nilsequence of degree d. 

Definition 4.1 (Polynomial nilsequence). Let $G$ be a (connected, simply-connected)
nilpotent Lie group. By a filtration $G_{\mathbb N} = (G_i)_{i \in \mathbb N}$ of degree $\le d$ we mean a nested
sequence $G \supseteq G_0 \supseteq G_1 \supseteq G_2 \supseteq \dots \supseteq G_{d+1} = \{\mathrm{id}\}$ with the property that
$[G_i, G_j] \subseteq G_{i+j}$ for all $i,j \ge 0$, adopting the convention that $G_i = \{\mathrm{id}\}$ for all
$i > d$. By a polynomial sequence adapted to $G_{\mathbb N}$ we mean a map $g : \mathbb Z \to G$ such that
$\partial_{h_i} \cdots \partial_{h_1} g \in G_i$ for all $i$ and $h_1, \dots, h_i \in \mathbb Z$, where $\partial_h g(n) := g(n+h) g(n)^{-1}$.
Write $\mathrm{poly}(\mathbb Z_{\mathbb N} \to G_{\mathbb N})$ for the collection of all such polynomial sequences.

Let $\Gamma \le G$ be a lattice in $G$ (i.e. a discrete and cocompact subgroup), so that
the quotient $G/\Gamma$ is a nilmanifold, and assume that each of the $G_i$ are rational
subgroups (i.e. $\Gamma_i := \Gamma \cap G_i$ is a cocompact subgroup of $G_i$). We refer to the
pair $G/\Gamma = (G/\Gamma, G_{\mathbb N})$ as a filtered nilmanifold. A polynomial orbit $\mathcal O : \mathbb Z \to G/\Gamma$
is a sequence of the form $\mathcal O(n) := g(n)\Gamma$, where $g \in \mathrm{poly}(\mathbb Z_{\mathbb N} \to G_{\mathbb N})$; we let
$\mathrm{poly}(\mathbb Z_{\mathbb N} \to (G/\Gamma)_{\mathbb N})$ denote the space of all such polynomial orbits. If $F : G/\Gamma \to \mathbb C$
is a 1-bounded, Lipschitz function then the sequence $F \circ \mathcal O = (F(g(n)\Gamma))_{n \in \mathbb Z}$ is called
a polynomial nilsequence of degree $d$.

The subscripts $\mathbb N$ will become more relevant later in this paper, when we start
filtering nilpotent groups and nilmanifolds by other index sets $I$ than the natural
numbers $\mathbb N$. Note that we do not require $G_0$ or $G_1$ to equal $G$; this freedom will be
convenient for some minor technical reasons, although ultimately it will not enlarge
the space of polynomial nilsequences.

Let us give the basic examples of nilsequences and polynomials: 

Example 4.2 (Linear nilsequences are polynomial nilsequences). Let $G$ be a $d$-step
nilpotent Lie group, and let $\Gamma$ be a lattice of $G$. Then, as is well known (see e.g.
[7]), the lower central series filtration defined by $G_0 = G_1 := G$, $G_2 := [G, G_1]$,
$G_3 := [G, G_2]$, ..., $G_{d+1} := [G, G_d] = \{\mathrm{id}\}$ is a filtration on $G$. Using the Baker-
Campbell-Hausdorff formula (3.3) it is not difficult to show that the lower central
series filtration is rational with respect to $\Gamma$, so the nilmanifold $G/\Gamma$ becomes a
filtered nilmanifold. If $g(n) := g_1^n g_0$ for some $g_0, g_1 \in G$, then $\partial_{h_1} g(n) = g_1^{h_1}$
and $\partial_{h_i} \cdots \partial_{h_1} g(n) = \mathrm{id}$ for $i \ge 2$; therefore $g$ is a polynomial sequence, and
so every linear orbit $n \mapsto g^n x$ with $g \in G$ and $x \in G/\Gamma$ is a polynomial orbit
also. As a consequence we see that every $d$-step linear nilsequence $n \mapsto F(g^n x)$ is
automatically a polynomial nilsequence of degree $\le d$.

Example 4.3 (Polynomial phases are polynomial nilsequences). Let $d$ be an
integer. Then we can give the unit circle $\mathbb T$ the structure of a degree $\le d$ filtered
nilmanifold by setting $G := \mathbb R$ and $\Gamma := \mathbb Z$, with $G_i := \mathbb R$ for $i \le d$ and $G_i := \{0\}$
for $i > d$. This is clearly a filtered nilmanifold. If $\alpha_0, \dots, \alpha_d$ are real numbers,
then the polynomial $P(n) := \alpha_0 + \dots + \alpha_d n^d$ is then polynomial with respect to
this filtration, with $n \mapsto P(n) \bmod 1$ being a polynomial orbit in $\mathbb T$. Thus, for
any Lipschitz function $F : \mathbb T \to \mathbb C$, the sequence $n \mapsto F(P(n))$ is a polynomial
nilsequence of degree $\le d$; in particular, the polynomial phase $n \mapsto e(P(n))$ is a
polynomial nilsequence.
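
The claim in Example 4.3 that $P$ is polynomial with respect to this filtration amounts to saying that $d+1$ discrete derivatives annihilate $P$. The following Python sketch checks this for one polynomial; it is only an illustration, and the coefficients, shifts and helper names (`derivative`, `iterated_derivative`, `make_polynomial`) are assumptions chosen for the example, not taken from the paper. In the additive group $\mathbb R$ the derivative $\partial_h g(n) = g(n+h)g(n)^{-1}$ becomes $g(n+h) - g(n)$.

```python
from fractions import Fraction


def derivative(f, h):
    """Additive discrete derivative: (d_h f)(n) = f(n + h) - f(n)."""
    return lambda n: f(n + h) - f(n)


def iterated_derivative(f, shifts):
    """Apply d_h successively for every shift h in `shifts`."""
    for h in shifts:
        f = derivative(f, h)
    return f


def make_polynomial(coeffs):
    """Return P(n) = coeffs[0] + coeffs[1]*n + ... + coeffs[d]*n**d."""
    return lambda n: sum(c * n**k for k, c in enumerate(coeffs))


# Hypothetical coefficients alpha_0, ..., alpha_3, so d = 3.
coeffs = [Fraction(1, 3), Fraction(-2, 7), Fraction(5, 11), Fraction(1, 2)]
P = make_polynomial(coeffs)

# Three derivatives leave the constant d! * alpha_d * h_1 * h_2 * h_3 (in G_3 = R).
third = iterated_derivative(P, [2, -1, 5])
# Four derivatives land in G_4 = {0}.
fourth = iterated_derivative(P, [2, -1, 5, 3])

print(third(0), fourth(0))
```

Exact rational arithmetic is used so that the vanishing of the fourth derivative is an identity rather than a floating-point approximation.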

Example 4.4 (Combinations of monomials are polynomials). By Corollary B.4
we see that if $G = (G, (G_i)_{i \in \mathbb N})$ is a filtered group of degree $\le d$, then any sequence
of the form

$n \mapsto \prod_{j=1}^{k} g_j^{P_j(n)}$

in which $g_j \in G_{d_j}$ for some $d_j \in \mathbb N$, and $P_j : \mathbb Z \to \mathbb R$ is a polynomial of degree $\le d_j$,
will be a polynomial map. Thus for instance

$n \mapsto g_d^{\binom{n}{d}} \cdots g_2^{\binom{n}{2}} g_1^{n} g_0$

is a polynomial map whenever $g_j \in G_j$ for $j = 0, \dots, d$. In fact, all polynomial
maps can be expressed in such a fashion via a Taylor expansion; see Lemma B.9.
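
As a sanity check on Example 4.4, the sketch below verifies the polynomial sequence condition of Definition 4.1 for a map of the form $n \mapsto g_2^{\binom{n}{2}} g_1^n g_0$. It works under simplifying assumptions that are not in the paper: the group is the integer Heisenberg group (a discrete stand-in for a 2-step nilpotent Lie group, chosen so that arithmetic is exact), with the lower central series filtration, and the elements `g0`, `g1`, `g2` are arbitrary hypothetical choices.

```python
# Elements (a, b, c) stand for the matrix [[1, a, c], [0, 1, b], [0, 0, 1]].
IDENTITY = (0, 0, 0)


def mul(x, y):
    """Group law of the Heisenberg group in matrix coordinates."""
    a, b, c = x
    a2, b2, c2 = y
    return (a + a2, b + b2, c + c2 + a * b2)


def inv(x):
    """Inverse element: mul(x, inv(x)) == IDENTITY."""
    a, b, c = x
    return (-a, -b, a * b - c)


def power(x, n):
    """x**n for any integer n, using the closed form for unitriangular matrices."""
    a, b, c = x
    return (n * a, n * b, n * c + (n * (n - 1) // 2) * a * b)


def derivative(g, h):
    """Multiplicative derivative (d_h g)(n) = g(n + h) g(n)^{-1}."""
    return lambda n: mul(g(n + h), inv(g(n)))


def in_G2(x):
    """G_2 = [G, G] is the centre {(0, 0, c)}."""
    return x[0] == 0 and x[1] == 0


# Hypothetical choices: g0, g1 in G_0 = G_1 = G, and g2 in G_2.
g0, g1, g2 = (1, -2, 3), (2, 5, -1), (0, 0, 7)


def g(n):
    """The map n -> g2^binom(n,2) g1^n g0 from Example 4.4 with d = 2."""
    return mul(mul(power(g2, n * (n - 1) // 2), power(g1, n)), g0)


second = derivative(derivative(g, 2), 3)
third = derivative(second, -4)
print(second(0), third(0))
```

Here the second derivative is the constant central element $g_2^{h_1 h_2}$, and the third derivative is the identity, which is what the filtration of degree $\le 2$ requires.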

We will give several further examples and properties of polynomial maps and
polynomial nilsequences in §6.

As a consequence of Example 4.2, the following variant of the inverse conjecture
GI(s) is ostensibly weaker than that stated in the introduction.

Conjecture 4.5 (GI(s), polynomial formulation). Let $s \ge 0$ be an integer, and
let $0 < \delta \le 1$. Then there exists a finite collection $\mathcal M_{s,\delta}$ of filtered nilmanifolds
$G/\Gamma = (G/\Gamma, G_{\mathbb N})$, each equipped with some smooth Riemannian metric $d_{G/\Gamma}$ as
well as constants $C(s,\delta), c(s,\delta) > 0$ with the following property. Whenever $N \ge 1$
and $f : [N] \to \mathbb C$ is a function bounded in magnitude by 1 such that $\|f\|_{U^{s+1}[N]} \ge \delta$,
there exists a filtered nilmanifold $G/\Gamma \in \mathcal M_{s,\delta}$, some $g \in \mathrm{poly}(\mathbb Z_{\mathbb N} \to G_{\mathbb N})$ and a
function $F : G/\Gamma \to \mathbb C$ bounded in magnitude by 1 and with Lipschitz constant at
most $C(s,\delta)$ with respect to the metric $d_{G/\Gamma}$ such that

$|\mathbb E_{n \in [N]} f(n) F(g(n)\Gamma)| \ge c(s,\delta)$.

It turns out that this conjecture is actually equivalent to Conjecture 1.2; we
shall prove this equivalence in Appendix C. We remark that, though it might seem
odd to put a non-trivial part of the proof of our main theorem in an appendix,
we would rather encourage the reader to regard the proof of Conjecture 4.5 as our
main theorem. The rationale behind this is that everything that is done with linear
nilsequences $F(g^n x\Gamma)$ in [23] could have been done equally well, and perhaps more
naturally, with polynomial nilsequences $F(g(n)\Gamma)$. Further remarks along these
lines were made in the introduction to our earlier paper [28], where the polynomial
formulation was emphasised from the outset. Here, however, we have felt a sense
of duty to formally complete the programme outlined in [23].

Henceforth we shall refer simply to a nilsequence, rather than a polynomial 
nilsequence. 

In §6 we will need to generalise the notion of a (polynomial) nilsequence by
allowing more exotic filtrations $G_I$ on the group $G$, indexed by more complicated
index sets $I$ than the natural numbers $\mathbb N$. In particular, we shall introduce the
multidegree filtration, which allows us to define nilsequences of several variables, as
well as the degree-rank filtration which provides a finer classification of polynomial
sequences than merely the degree. We will discuss these using examples, and then
develop a more unified theory that contains all three.

5. Taking ultralimits

The inverse conjecture, Conjecture 4.5, is a purely finitary statement, involving
functions on a finite set [N] = {1 , . . . , N} of integers. As such, it is natural to look 
for proofs of this conjecture which are also purely finitary, and much of the previous 
literature on these types of problems is indeed of this nature. 

However there is a very notable exception, namely the portion of the litera- 
ture that exploits the Furstenberg correspondence principle between combinatorial 
problems and ergodic theory. See [12] for the original application to Szemerédi's
theorem, or [56] for a more recent application to Gowers norms over finite fields.
Here we use a somewhat different type of limit object, namely an ultralimit. We
are certainly not the first to employ ultralimits (a.k.a. nonstandard analysis) in
additive number theory; see for example [40].

The ultralimit formalism allows us to convert a "finitary" or "standard" statement
such as Conjecture 4.5 into an equivalent statement concerning limit objects,
constructed as ultralimits of standard objects. This procedure is closely related to 
the use of the transfer principle in nonstandard analysis, but we have elected to 
eschew the language of nonstandard analysis in order to reduce confusion, instead 
focusing on the machinery of ultralimits. 

Here is a brief and somewhat vague list of the advantages of using the ultralimit 
approach. 

• Pigeonholing arguments are straightforward (due to the fact that a limit 
function taking finitely many values is constant); 

• Book-keeping of constants: one can talk rigorously about such concepts as 
"bounded" functions without a need to quantify the bounds; 

• One may make rigorous sense of such statements as "the function $f : [N] \to \mathbb C$
and the function $g : [N] \to \mathbb C$ are equivalent modulo degree $s$
nilsequences";

• In the infinitary context one may easily perform rank reduction arguments 
in which one seeks to find the "minimal bounded-complexity" representa- 
tion of a given system. 

There are also some drawbacks of the approach: 

• It becomes quite difficult to extract any quantitative bounds from our
results; in particular we do not give explicit bounds on the constant $c(s,\delta)$
or on the complexity of the nilsequence in Conjecture 1.2 or Conjecture 4.5.
It is in principle possible to expand the ultralimit proof into a standard
proof, but the bounds are quite poor (of Ackermann type) due to the
repeated use of "rank reduction arguments" and other highly iterative
schemes that arise in the conversion of ultralimit arguments to standard
ones. For further discussion of the relation of ultralimit analysis to finitary
analysis see [55, §1.3, §1.5].

• The language of ultrafilters adds one more layer of notational complexity 
to an already notationally-intensive paper; however, there are gains to be 
made elsewhere, most notably in eliminating many quantitative constants 
(e.g. $\varepsilon$, $N$) and growth functions (e.g. $\mathcal F$).

Limit formulation of GI(s). The basic notation and theory of ultralimits
are reviewed in Appendix A. We now use this formalism to convert the inverse
conjecture, GI(s), into an equivalent statement formulated in the framework of
ultralimits. We first consider a limit version of the concept of a Lipschitz function
on a nilmanifold. For technical reasons we will need to consider vector-valued
functions, taking values in $\mathbb C^D$ or $\overline{\mathbb C}^D$ rather than $\mathbb C$ or $\overline{\mathbb C}$.

Definition 5.1 (Lipschitz functions). Let $G/\Gamma$ be a standard nilmanifold, and let
$D \in \mathbb N^+$ be standard.

• We let $\mathrm{Lip}(G/\Gamma \to \mathbb C^D)$ be the space of standard Lipschitz functions
$F : G/\Gamma \to \mathbb C^D$. (Here we endow the compact manifold $G/\Gamma$ with a smooth
metric in an arbitrary fashion; the exact choice of metric is not relevant.)

• We let $\mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^D)$ be the space of bounded limit functions
$F : {}^*(G/\Gamma) \to \overline{\mathbb C}^D$ whose Lipschitz constant is bounded (or equivalently, $F$
is an ultralimit of uniformly bounded functions $F_n : G/\Gamma \to \mathbb C^D$ with
uniformly bounded Lipschitz constant).

• We let $\mathrm{Lip}({}^*(G/\Gamma) \to S^{2D-1})$ be the functions in $\mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^D)$ that
take values in the (limit) complex sphere

$S^{2D-1} := \{z \in \overline{\mathbb C}^D : |z| = 1\}$.

• We write

$\mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^\omega) := \bigcup_{D \in \mathbb N^+} \mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^D)$

and

$\mathrm{Lip}({}^*(G/\Gamma) \to S^\omega) := \bigcup_{D \in \mathbb N^+} \mathrm{Lip}({}^*(G/\Gamma) \to S^{2D-1})$.

We will often abbreviate these spaces as $\mathrm{Lip}(G/\Gamma)$ or $\mathrm{Lip}({}^*(G/\Gamma))$ when the range
of the functions involved is not relevant to the discussion.

Remark. As $G/\Gamma$ is compact, we see from the Arzelà-Ascoli theorem that
$\mathrm{Lip}(G/\Gamma \to \mathbb C^D)$ is locally compact in the $L^\infty(G/\Gamma \to \mathbb C^D)$ topology. As a
consequence, if we embed $\mathrm{Lip}(G/\Gamma \to \mathbb C^D)$ into $\mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^D)$ in the obvious
manner, then the former is a dense subspace of the latter in the (standard) uniform
topology, in the sense that for every $F \in \mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^D)$ and every standard
$\varepsilon > 0$ there exists $F' \in \mathrm{Lip}(G/\Gamma \to \mathbb C^D)$ such that $|F(x) - F'(x)| < \varepsilon$ for all
$x \in {}^*(G/\Gamma)$.

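Remark. Observe that the spaces $\mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^D)$ and $\mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^\omega)$
are vector spaces over $\mathbb C$. The spaces $\mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^\omega)$ and $\mathrm{Lip}({}^*(G/\Gamma) \to S^\omega)$
are also closed under tensor product (as defined in §3). All the spaces defined in
Definition 5.1 are closed under complex conjugation.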

Using the above notion, we can define the limit version of a (polynomial) nilse- 
quence. 

Definition 5.2 (Nilsequence). Let $s \ge 0$ be standard. A nilsequence of degree
$\le s$ is any limit function $\psi : {}^*\mathbb Z \to {}^*\mathbb C$ of the form $\psi(n) := F(g(n)\Gamma)$, where
$G/\Gamma = (G/\Gamma, G_{\mathbb N})$ is a standard filtered nilmanifold of degree $\le s$, $g : {}^*\mathbb Z \to {}^*G$ is a
limit polynomial sequence (i.e. an ultralimit of polynomial sequences $g_n : \mathbb Z \to G$),
and $F \in \mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^\omega)$.

Given any limit subset $\Omega$ of ${}^*\mathbb Z$, we denote the space of degree $d$ nilsequences,
restricted to $\Omega$, as $\mathrm{Nil}^d(\Omega) = \mathrm{Nil}^d(\Omega \to \overline{\mathbb C}^\omega)$; this is a subset of $L^\infty(\Omega \to \overline{\mathbb C}^\omega)$. We
write $\mathrm{Nil}^d(\Omega \to \overline{\mathbb C}^D)$ for the nilsequences that take values in $\overline{\mathbb C}^D$; this is a subspace
(over $\mathbb C$) of $L^\infty(\Omega \to \overline{\mathbb C}^D)$. We make the technical remark that $\mathrm{Nil}^d(\Omega)$ is a
$\sigma$-limit set, since one can express this space as the union, over all standard $M$ and
dimensions $D$, of the nilsequences taking values in $\overline{\mathbb C}^D$ arising from a nilmanifold of
"complexity" $M$ and a Lipschitz function of constant at most $M$, where one defines
the complexity of a nilmanifold in some suitable fashion. In particular, the limit
selection lemma in Corollary A.12 can be applied to this set.

We also define the Gowers uniformity norm $\|f\|_{U^{s+1}[N]}$ of an ultralimit
$f = \lim_{n \to p} f_n$ of standard functions $f_n : [N_n] \to \mathbb C$ in the usual limit fashion

$\|f\|_{U^{s+1}[N]} := \lim_{n \to p} \|f_n\|_{U^{s+1}[N_n]}$.

If $f$ is vector-valued instead of scalar-valued, say $f = (f_1, \dots, f_d)$, then we define
the uniformity norm by the formula

$\|f\|_{U^{s+1}[N]} := \Big( \sum_{i=1}^{d} \|f_i\|_{U^{s+1}[N]}^{2^{s+1}} \Big)^{1/2^{s+1}}$.

(The exponent $2^{s+1}$ is not important here, but has some very slight aesthetic
advantages over other equivalent formulations of the vector-valued norm.)
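
To make the remark about the exponent concrete, here is a small Python sketch of the vector-valued formula. It takes the component norms as given numbers (computing a Gowers norm itself is outside its scope), and the function name `vector_uniformity_norm` and the sample values are hypothetical. It illustrates that the combined norm is comparable to the largest component norm up to a factor $d^{1/2^{s+1}}$, which is the sense in which other exponents would give equivalent formulations.

```python
def vector_uniformity_norm(component_norms, s):
    """Combine component norms ||f_i|| into (sum_i ||f_i||^(2^(s+1)))^(1/2^(s+1))."""
    p = 2 ** (s + 1)
    return sum(x ** p for x in component_norms) ** (1.0 / p)


# Hypothetical component norms for a function f = (f_1, f_2, f_3), with s = 2.
norms = [0.3, 0.1, 0.2]
s = 2
total = vector_uniformity_norm(norms, s)
p = 2 ** (s + 1)

# The combined norm is squeezed between max_i ||f_i|| and d^(1/p) * max_i ||f_i||.
lower = max(norms)
upper = len(norms) ** (1.0 / p) * max(norms)
print(lower, total, upper)
```

In particular, if the vector-valued norm is large then at least one component norm is large, up to a loss depending only on the number of components and on $s$.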
The ultralimit formulation of GI(s) can then be given as follows: 

Conjecture 5.3 (Ultralimit formulation of GI(s)). Let $s \ge 0$ be standard and
$N \ge 1$ be a limit natural number. Suppose that $f \in L^\infty([N] \to \overline{\mathbb C})$ is such that
$\|f\|_{U^{s+1}[N]} \gg 1$. Then $f$ correlates with a degree $\le s$ nilsequence on $[N]$.

See Definition A.7 for the definition of correlation in this context.
We now show why, for any fixed standard $s$, Conjecture 5.3 is equivalent to its
more traditional counterpart, Conjecture 4.5.

Proof of Conjecture 5.3 assuming Conjecture 4.5. Let $f$ be as in Conjecture 5.3.
We may normalise the bounded function $f$ to be bounded by 1 in magnitude
throughout. By hypothesis, there exists a standard $\delta > 0$ such that $\|f\|_{U^{s+1}[N]} \ge \delta$.
Writing $N$ and $f$ as the ultralimits of $N_n$, $f_n$ respectively for some $f_n : [N_n] \to \mathbb C$
bounded in magnitude by 1, and applying Conjecture 4.5, we conclude that for $n$
sufficiently close to $p$, we have the correlation bound

$|\mathbb E_{n_n \in [N_n]} f_n(n_n) F_n(g_n(n_n)\Gamma_n)| \ge c(s,\delta) > 0$

where $G_n/\Gamma_n$, $g_n$, $x_n$, $F_n$ are as in Conjecture 1.2. Writing $G/\Gamma$, $g$, $x$, $F$ for the
ultralimits of $G_n/\Gamma_n$, $g_n$, $x_n$, $F_n$ respectively, we thus have

$|\mathbb E_{n \in [N]} f(n) F(g(n){}^*\Gamma)| \gg 1$.

By the pigeonhole principle (cf. Appendix A), we see that $G/\Gamma$ is a standard degree
$\le s$ nilmanifold, while $g : {}^*\mathbb Z \to {}^*G$ and $x \in G/\Gamma$ remain limit objects. The limit
function $F$ lies in $\mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C})$ by construction, and the claim follows. □

Proof of Conjecture 4.5 assuming Conjecture 5.3. Observe (from the theory of
Mal'cev bases [48]) that there are only countably many degree $\le s$ nilmanifolds
$G/\Gamma$ up to isomorphism, which we may enumerate as $G_n/\Gamma_n$. We endow each of
these nilmanifolds arbitrarily with some smooth Riemannian metric $d_{G_n/\Gamma_n}$.

Suppose for contradiction that Conjecture 4.5 failed. Carefully negating all the
quantifiers, we may thus find a $\delta > 0$, a sequence $N_n$ of standard integers, and a
function $f_n : [N_n] \to \mathbb C$ bounded in magnitude by 1 with $\|f_n\|_{U^{s+1}[N_n]} \ge \delta$ such
that

$|\mathbb E_{n_n \in [N_n]} f_n(n_n) F(g(n_n)\Gamma_{n'})| \le 1/n$ (5.1)

whenever $n' \le n$, $g \in \mathrm{poly}(\mathbb Z_{\mathbb N} \to (G_{n'})_{\mathbb N})$ and $F : G_{n'}/\Gamma_{n'} \to \mathbb C$ is bounded in
magnitude by 1 and has a Lipschitz constant of at most $n$ with respect to $d_{G_{n'}/\Gamma_{n'}}$.
On the other hand, viewing $f$ as a bounded limit function, we can apply
Conjecture 5.3 and conclude that there exists a standard filtered nilmanifold $G/\Gamma$ with
some smooth Riemannian metric $d_{G/\Gamma}$, a limit polynomial $g : {}^*\mathbb Z \to {}^*G$, and some
ultralimit $F \in \mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C})$ of functions $F_n : G/\Gamma \to \mathbb C$ with uniformly
bounded Lipschitz norm, such that

$|\mathbb E_{n \in [N]} f(n) F(g(n){}^*\Gamma)| > \varepsilon$

for some standard $\varepsilon > 0$.

By construction, $G/\Gamma$ is isomorphic to $G_{n_0}/\Gamma_{n_0}$ for some $n_0$, so we may
assume without loss of generality that $G/\Gamma = G_{n_0}/\Gamma_{n_0}$; since all smooth Riemannian
metrics on a compact manifold are equivalent, we can also assume that
$d_{G/\Gamma} = d_{G_{n_0}/\Gamma_{n_0}}$. We may also normalise $F$ to be bounded in magnitude by 1. But this
contradicts (5.1) for $n$ sufficiently large, and the claim follows.

Thus, to establish Theorem 1.3, it will suffice to establish Conjecture 5.3 for
$s \ge 3$. This is the objective of the remainder of the paper.

Remark. We transformed the finitary linear inverse conjecture, Conjecture 1.2,
into a nonstandard polynomial formulation, Conjecture 5.3, via the finitary
polynomial inverse conjecture, Conjecture 4.5. One can also swap the order of these
equivalences, transforming the finitary linear inverse conjecture into a nonstandard
linear formulation by arguing as above, and then transforming the latter into a
nonstandard polynomial formulation by using Proposition C.2. Of course the two
arguments are essentially equivalent.

Conjecture 5.3 is trivial when $N$ is bounded, since every function in $L^\infty[N]$ is
then a nilsequence of degree at most $s$. For the remainder of the paper we shall
thus adopt the convention that $N$ denotes a fixed unbounded limit integer.

To conclude this section we reformulate Conjecture 4.5 by introducing the
important notion of bias.

Definition 5.4 (Bias and correlation). Let $\Omega$ be a limit finite subset of $\mathbb Z$, and let
$d \in \mathbb N$. We say that $f, g \in L^\infty(\Omega \to \overline{\mathbb C}^\omega)$ $d$-correlate if we have

$|\mathbb E_{n \in \Omega} f(n) \otimes \overline{g(n)} \otimes \overline{\psi(n)}| \gg 1$

for some degree $d$ nilsequence $\psi \in \mathrm{Nil}^d(\Omega \to \overline{\mathbb C}^\omega)$. We say that $f$ is $d$-biased if $f$
$d$-correlates with the constant function 1, and $d$-unbiased otherwise.

With this definition, Conjecture 5.3 can be reformulated in the following manner.

Conjecture 5.5 (Limit formulation of GI(s), II). Let $s \ge 0$ be standard. Suppose
that $f \in L^\infty([N] \to \overline{\mathbb C})$ is such that $\|f\|_{U^{s+1}[N]} \gg 1$. Then $f$ is $s$-biased.

From previous literature, we see that Conjecture 5.3 has already been proven for
$s \le 2$; we need to establish it for all $s \ge 3$. We also make the basic remark that
while the conjecture is only phrased for scalar-valued functions $f \in L^\infty([N] \to \overline{\mathbb C})$,
it automatically generalises to vector-valued functions $f \in L^\infty([N] \to \overline{\mathbb C}^\omega)$, since
if a vector-valued function $f$ has large $U^{s+1}[N]$ norm, then so does one of its
components.

Finally we remark that the converse implication is known.

Proposition 5.6 (Converse GI(s), ultralimit formulation). Let $s \ge 0$ be standard.
Suppose that $f \in L^\infty([N] \to \overline{\mathbb C})$ is $\le s$-biased. Then $\|f\|_{U^{s+1}[N]} \gg 1$.

Proof. This follows from [21, Proposition 12.6], [23, §11], or [28, Proposition 1.4],
transferred to the ultralimit setting in the usual fashion. □

6. Nilcharacters and symbols in one and several variables

Conjecture 5.3 asserts that a function in $L^\infty([N] \to \overline{\mathbb C})$ on an unbounded interval
$[N]$ correlates with a degree $\le s$ nilsequence. For inductive reasons, it is useful
to observe that this conclusion implies a strengthened version of itself, in which
$f$ correlates with a special type of degree $\le s$ nilsequence, namely a degree $s$
nilcharacter. A nilcharacter is a special type of nilsequence and should be thought
of, very roughly speaking, as a generalisation of characters $e(\alpha n)$ in the degree 1
setting, or objects such as $e(\alpha n\{\beta n\})$ in the degree 2 setting; these were crucial in
our paper on GI(3) [28], although the notation there was slightly different in some
minor ways. See [29] for further informal discussion of nilcharacters.

In the $s = 1$ case, a nilcharacter is essentially (ignoring constants) the same
thing as a linear phase function $n \mapsto e(\xi n)$, and the frequency $\xi$ can be viewed
as living in the Pontryagin dual of ${}^*\mathbb Z$ (or, in some sense, of $[N]$, even though the
latter set is not quite a locally compact abelian group). It will turn out that more
generally, a degree $s$ nilcharacter will have a "symbol" (analogous to the frequency
$\xi$) that takes values in a "higher order Pontryagin dual" $\mathrm{Symb}^s([N])$ of $[N]$; this
symbol can be interpreted as the "top order term" of a nilcharacter, for instance
the symbol of the degree 3 nilcharacter $n \mapsto e(\alpha n^3 + \beta n^2 + \gamma n + \delta)$ is basically
$\alpha$ (*). This higher order dual obeys a number of pleasant algebraic properties, and the
primary purpose of this section is to develop those properties.

There are various additional complications to be taken into account: 

• We will require multidimensional generalisations of these concepts (think
of the two-dimensional sequence $(n_1, n_2) \mapsto e(\alpha n_1\{\beta n_2\})$) together with
appropriate notions of multidegree in order to make sense of "top-order"
and "lower-order terms";

• We will be dealing with $\overline{\mathbb C}^D$-valued (or, rather, $S^{2D-1}$-valued) nilsequences
rather than merely scalar ones. This is so that we may continue to work
in the smooth category, as discussed in the introduction;

• The language of ultrafilters will be used. 

Our main focus here will be on the first of these points. The second is largely 
a technicality, whilst the third is actually helpful in that the notion of symbol (for 
example) is rather clean and does not require discussion of complexity bounds. 

Motivation and one-dimensional definitions. We now give the definitions 
of a (one-dimensional) nilcharacter and its symbol, and give a few examples. How- 
ever, we will hold off for now on actually proving too much about these concepts, 
because we will shortly need to generalise these notions to a more abstract setting 
in which one also allows multidimensional nilcharacters, and nilcharacters that are 
attuned not just to a specific degree, but also to a specific "rank" inside that degree.

(*) This is an oversimplification; it would be more accurate to say that the symbol is given by
$\alpha$ modulo ${}^*\mathbb Z + \mathbb Q + O(N^{-3})$.

Definition 6.1 (Nilcharacter). Let $d \ge 0$ be a standard integer. A nilcharacter $\chi$
of degree $d$ on $[N]$ is a nilsequence $\chi(n) = F(\mathcal O(n)) = F(g(n){}^*\Gamma)$ on $[N]$ of degree
$\le d$, where the function $F \in \mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C}^\omega)$ obeys two additional properties:

• $F \in \mathrm{Lip}({}^*(G/\Gamma) \to S^\omega)$ (thus $|F| = 1$ pointwise, and hence $|\chi| = 1$
pointwise also); and

• $F(g_d x) = e(\eta(g_d)) F(x)$ for all $x \in G/\Gamma$ and $g_d \in G_d$, where $\eta : G_d \to \mathbb R$ is
a continuous standard homomorphism which maps $\Gamma_d$ to the integers (or
equivalently, $\eta$ is an element of the Pontryagin dual of the torus $G_d/\Gamma_d$).
We call $\eta$ the vertical frequency of $F$.

The space of all nilcharacters of degree $d$ on $[N]$ is denoted $\Xi^d([N])$.

Example 6.2. When $d = 1$, the only examples of nilcharacters are the linear
phases $n \mapsto e(\alpha n + \beta)$ for $\alpha, \beta \in {}^*\mathbb R$.

Example 6.3. For any $\alpha_0, \dots, \alpha_d \in {}^*\mathbb R$, the function $n \mapsto e(\alpha_0 + \dots + \alpha_d n^d)$ is a
nilcharacter of degree $\le d$. To see this, we set $G/\Gamma$ to be the unit circle $\mathbb T$ with the
filtration $G_i := \mathbb R$ for $i \le d$ and $G_i := \{0\}$ for $i > d$ (thus $G/\Gamma$ is of degree $d$), let
$g(n) := \alpha_0 + \dots + \alpha_d n^d$, and let $F(x) := e(x)$. The vertical frequency $\eta : \mathbb R \to \mathbb R$ is
then just the identity function.

Now we give an instructive near-example of a nilcharacter. Let $G$ be the free
2-step nilpotent Lie group on two generators $e_1, e_2$, thus

$G := \langle e_1, e_2 \rangle_{\mathbb R} = \{e_1^{t_1} e_2^{t_2} [e_1,e_2]^{t_{12}} : t_1, t_2, t_{12} \in \mathbb R\}$ (6.1)

with the element $[e_1,e_2]$ being central, but with no other relations between $e_1$, $e_2$
and $[e_1,e_2]$. This is a degree $\le 2$ nilpotent group if we set $G_0, G_1 := G$ and

$G_2 := \langle [e_1,e_2] \rangle_{\mathbb R} = \{[e_1,e_2]^{t_{12}} : t_{12} \in \mathbb R\}$.

We let

$\Gamma := \langle e_1, e_2 \rangle = \{e_1^{n_1} e_2^{n_2} [e_1,e_2]^{n_{12}} : n_1, n_2, n_{12} \in \mathbb Z\}$

be the discrete subgroup of $G$ generated by $e_1, e_2$; then $G/\Gamma$ is a degree $\le 2$ filtered
nilmanifold, known as the Heisenberg nilmanifold, and elements of $G/\Gamma$ can be
uniquely expressed using the fundamental domain

$G/\Gamma = \{e_1^{t_1} e_2^{t_2} [e_1,e_2]^{t_{12}} \Gamma : t_1, t_2, t_{12} \in I_0 := (-1/2, 1/2]\}$.

If we then set $g : {}^*\mathbb Z \to {}^*G$ to be the limit polynomial sequence $g(n) := e_2^{\beta n} e_1^{\alpha n}$
for some fixed $\alpha, \beta \in {}^*\mathbb R$, and let $F : G/\Gamma \to \mathbb C$ be the function defined on the
fundamental domain by the formula

$F(e_1^{t_1} e_2^{t_2} [e_1,e_2]^{t_{12}} \Gamma) := e(-t_{12})$ (6.2)

for $t_1, t_2, t_{12} \in I_0$, then one easily computes that

$F(g(n){}^*\Gamma) = e(\{\alpha n\}\beta n)$

where $\{\cdot\} : \mathbb R \to I_0$ is the signed fractional part function. The function $n \mapsto
e(\{\alpha n\}\beta n)$ is then almost a nilcharacter of degree 2, with vertical frequency given by
the function $\eta : [e_1,e_2]^{t_{12}} \mapsto -t_{12}$. All the properties required to give a nilcharacter
in Definition 6.1 are satisfied, save for one: the function $F$ is not Lipschitz on
all of $G/\Gamma$, but is instead merely piecewise Lipschitz, being discontinuous at some
portions of the boundary of the fundamental domain. To put it another way, one
can view $n \mapsto e(\{\alpha n\}\beta n)$ as a piecewise nilcharacter of degree 2.

Indeed, a topological obstruction prevents one from constructing any scalar
function $F \in \mathrm{Lip}({}^*(G/\Gamma) \to S^1)$ of unit magnitude on the Heisenberg nilmanifold with
the above vertical frequency. By taking standard parts, we may assume that $F$
comes from a standard Lipschitz function $F : G/\Gamma \to S^1$ with the same vertical
frequency. For any standard $t \in [-1/2, 1/2]$, consider the loop $\gamma_t := \{e_1^t e_2^s \Gamma : s \in I_0\}$.
The image $F(\gamma_t)$ of this loop lives on the unit circle and thus has a well-defined
winding number (or degree). As this degree must vary continuously in $t$ while
remaining an integer, it is constant in $t$; in particular, $F(\gamma_{-1/2})$ and $F(\gamma_{1/2})$ must
have the same winding number. On the other hand, from the Baker-Campbell-
Hausdorff formula (3.2) we see that

$F(e_1^{1/2} e_2^s \Gamma) = F(e_1^{-1/2} e_2^s e_1 [e_1,e_2]^s \Gamma) = e(s) F(e_1^{-1/2} e_2^s \Gamma)$

and so the winding number of $F(\gamma_{1/2})$ is one larger than the winding number of
$F(\gamma_{-1/2})$; a contradiction.

If however we allow ourselves to work with higher dimensions $D$, then this
topological obstruction disappears. Indeed, let us take a smooth partition of
unity $1 = \sum_{k=1}^{D} \varphi_k^2(t,s)$ on $\mathbb T^2$, where $D \in \mathbb N^+$ and each $\varphi_k$ is supported in
$B_k \bmod \mathbb Z^2$, where $B_k$ is a ball of radius 1/100 (say) in $\mathbb R^2$. Then if we define
$F := (F_1, F_2, \dots, F_D)$, where

$F_k(e_1^t e_2^s [e_1,e_2]^u\, {}^*\Gamma) := \varphi_k(t,s) e(u)$ (6.3)

whenever $(t,s) \in {}^*B_k$ and $u \in {}^*\mathbb R$, with $F_k = 0$ if no such representation of the
above form exists, then one easily verifies that $F$ lies in $\mathrm{Lip}({}^*(G/\Gamma) \to S^{2D-1})$ with
the vertical frequency $\eta$, and so the vector-valued sequence $\chi : n \mapsto F(g(n){}^*\Gamma)$ is
a nilcharacter of degree 2. A computation shows that each component $\chi_k$ of this
nilcharacter $\chi = (\chi_1, \dots, \chi_D)$ takes the form

$\chi_k(n) = e(\{\alpha n - \theta_k\}\beta n)\psi_k(n)$

for some offset $\theta_k \in {}^*\mathbb R$ and some degree 1 nilsequence $\psi_k$. Thus we see that $\chi$ is
in some sense "equivalent modulo lower order terms" with the bracket polynomial
phase $n \mapsto e(\{\alpha n\}\beta n)$. We refer to the vector-valued nilsequence $\chi$ as a vector-
valued smoothing of the piecewise nilsequence $n \mapsto e(\{\alpha n\}\beta n)$; we will informally
refer to this smoothing operation several times in the sequel when discussing further
examples of nilsequences that are associated with bracket polynomials.
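
The scalar identity $F(g(n){}^*\Gamma) = e(\{\alpha n\}\beta n)$ for the piecewise Lipschitz function $F$ of (6.2), which underlies the bracket polynomial phase just discussed, can be checked mechanically. The Python sketch below is an illustration under assumptions not made in the paper: $\alpha, \beta$ are standard rationals (so that arithmetic is exact) rather than limit reals, group elements are stored as coordinates $(t_1, t_2, t_{12})$ of $e_1^{t_1} e_2^{t_2} [e_1,e_2]^{t_{12}}$, the commutator convention is $[g,h] = g^{-1}h^{-1}gh$, and all function names are hypothetical.

```python
from fractions import Fraction
from math import ceil


def signed_frac(x):
    """Signed fractional part {x}, taking values in I_0 = (-1/2, 1/2]."""
    return x - ceil(x - Fraction(1, 2))


def mul(x, y):
    """Multiply in coordinates, using e2^b e1^a' = e1^a' e2^b [e1,e2]^(-a' b)."""
    a, b, c = x
    a2, b2, c2 = y
    return (a + a2, b + b2, c + c2 - a2 * b)


def reduce_mod_lattice(x):
    """Fundamental-domain coordinates of the coset x * Gamma."""
    a, b, c = x
    n1 = signed_frac(a) - a  # integer shift of the e1 coordinate
    n2 = signed_frac(b) - b  # integer shift of the e2 coordinate
    a, b, c = mul((a, b, c), (n1, n2, 0))
    return (a, b, signed_frac(c))  # the central coordinate is reduced last


def nilsequence_phase(alpha, beta, n):
    """Phase (mod 1) of F(g(n) Gamma) for g(n) = e2^(beta n) e1^(alpha n)."""
    g_n = mul((0, beta * n, 0), (alpha * n, 0, 0))
    t1, t2, t12 = reduce_mod_lattice(g_n)
    return (-t12) % 1  # F = e(-t12) by (6.2)


def bracket_phase(alpha, beta, n):
    """Phase (mod 1) of the bracket polynomial e({alpha n} beta n)."""
    return (signed_frac(alpha * n) * beta * n) % 1


alpha, beta = Fraction(3, 7), Fraction(5, 11)
print(nilsequence_phase(alpha, beta, 2), bracket_phase(alpha, beta, 2))
```

The two phases agree for every integer $n$ because reducing the $e_1$ coordinate by an integer $n_1$ shifts the central coordinate by $-n_1 \beta n$, which converts $-\alpha\beta n^2$ into $-\{\alpha n\}\beta n$.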

Similar computations can be made in higher degree. For instance, bracket cubic
phases such as $n \mapsto e(\{\{\alpha n\}\beta n\}\gamma n)$ or $n \mapsto e(\{\alpha n^2\}\beta n)$ with $\alpha, \beta, \gamma \in {}^*\mathbb R$ can be
viewed as near-examples of degree 3 nilcharacters (with the problem again being
that $F$ is discontinuous on the boundary of the fundamental domain), but there exist
vector-valued smoothings of these phases which are genuine degree 3 nilcharacters.
We will not detail these computations here, but they can essentially be found in
[28, Appendix E]. More generally, one can view bracket polynomial phases of degree
$d$ as near-examples of nilcharacters of degree $d$ that can be converted to genuine
examples using vector-valued smoothings; this fact can be made precise using the
machinery from [46], but we will not need this machinery here.

Remark. The above topological obstruction is quite annoying; it is the sole
reason that we are forced to work with vector-valued functions. There are two
other approaches to avoid this topological obstruction that we know of. One is
to work with piecewise Lipschitz functions rather than Lipschitz functions. This
allows one in particular to build (piecewise) nilcharacters out of bracket
polynomials. This is the approach taken in [28]; however, it requires one to develop a
certain amount of "bracket calculus" to manipulate these polynomials, and some
additional arguments are also needed to deal with the discontinuities at the edges
of the piecewise components of the nilmanifold. Another approach is to work with
randomly selected fundamental domains of the nilmanifold (cf. [20]) which
eliminates topological obstructions, with the randomness being used to "average out"
the effects of the boundary of the domain. While all three methods will eventually
work for the purposes of establishing the inverse conjecture, we believe that the
vector-valued approach introduces the least amount of artificial technicality.

By definition, every nilcharacter of degree $d$ is a nilsequence of degree $\le d$. The
converse is far from being true; however, one can approximate nilsequences of degree
$\le d$ as bounded linear combinations of nilcharacters of degree $d$. More precisely,
we have the following lemma.

Lemma 6.4. Let $\psi \in \mathrm{Nil}^d([N] \to \overline{\mathbb C})$ be a scalar nilsequence of degree $d$, and let
$\varepsilon > 0$ be standard. Then one can approximate $\psi$ uniformly to error $\varepsilon$ by a bounded
linear combination (over $\mathbb C$) of the components of nilcharacters in $\Xi^d([N])$.

Proof. Unpacking the definitions, it suffices to show that for every degree $d$ filtered
nilmanifold $G/\Gamma$, every $F \in \mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb C})$, and every standard $\varepsilon > 0$, one can
approximate $F$ uniformly to error $\varepsilon$ by a bounded linear combination of functions in
the class $\mathcal F(G/\Gamma)$ of components of standard Lipschitz functions $F' \in \mathrm{Lip}(G/\Gamma \to
S^\omega)$ that have a vertical frequency in the sense of Definition 6.1.

By taking standard parts, we may assume that $F$ is a standard Lipschitz function.
Observe that $\mathcal F(G/\Gamma)$ is closed under multiplication and complex conjugation. By
the Stone-Weierstrass theorem, it thus suffices to show that $\mathcal F(G/\Gamma)$ separates any
two distinct points $x, y \in G/\Gamma$. If $x, y$ do not lie in the same orbit of $G_d$,
then this is clear from a partition of unity (taking $\eta = 0$). If instead $x = g_d y$
for some $g_d \in G_d$, then the distinctness of $x, y$ forces $g_d \notin \Gamma_d$, and hence by
Pontryagin duality there exists a vertical frequency $\eta$ with $\eta(g_d) \ne 0$. If one then
builds a nilcharacter with this frequency (by adapting the vector-valued smoothing
construction (6.3)) we obtain the claim. □

We remark that this lemma can also be proven, with better quantitative bounds,
by Fourier-analytic methods: see [21, Lemma 3.7]. As a corollary of the lemma, we
have the following.

Corollary 6.5. Suppose that $f \in L^\infty([N] \to \overline{\mathbb C}^\omega)$. Then $f$ is $d$-biased if and only
if $f$ correlates with a nilcharacter $\chi \in \Xi^d([N])$.

It is easy to see that if $\chi, \chi'$ are two nilcharacters of degree $d$, then the tensor
product $\chi \otimes \chi'$ and complex conjugate $\overline{\chi}$ are also nilcharacters. If all nilcharacters
were scalar, this would mean that the space $\Xi^d([N])$ of degree $d$ nilcharacters forms
a multiplicative abelian group. Unfortunately, nilcharacters can be vector-valued,
and so this statement is not quite true. However, it becomes true if one only focuses
on the "top order" behaviour of a nilcharacter. To isolate this behaviour, we adopt
the following key definition.

Definition 6.6 (Symbol). Let $d \ge 0$. Two nilcharacters $\chi, \chi' \in \Xi^d([N])$ of degree
$d$ are equivalent if $\chi \otimes \overline{\chi'}$ is equal on $[N]$ to a nilsequence of degree $\le d-1$. This can
be shown to be an equivalence relation (see Lemma E.7); the equivalence class of a
nilcharacter $\chi$ will be called the symbol of $\chi$ and is denoted $[\chi]_{\mathrm{Symb}^d([N])}$. The space
of all such symbols will be denoted $\mathrm{Symb}^d([N])$; we will show later (see Lemma E.8)
that this is an abelian multiplicative group.

When $d = 1$, two nilcharacters $n \mapsto e(\alpha n + \beta)$ and $n \mapsto e(\alpha' n + \beta')$ are equivalent
if and only if $\alpha - \alpha'$ is a limit integer, and $\mathrm{Symb}^1([N])$ is just ${}^*\mathbb{T}$ in this case.
However, the situation is more complicated in higher degree. To get some feel for
this, consider two polynomial phases

$\chi : n \mapsto e(\alpha_0 + \ldots + \alpha_d n^d)$

and

$\chi' : n \mapsto e(\alpha'_0 + \ldots + \alpha'_d n^d)$

with $\alpha_0, \ldots, \alpha_d, \alpha'_0, \ldots, \alpha'_d \in {}^*\mathbb{R}$, and consider the problem of determining when $\chi$ and
$\chi'$ are equivalent nilcharacters of degree $d$. Certainly this is the case if $\alpha_d$ and $\alpha'_d$
are equal, or differ by a limit integer. When $d \ge 2$, there are two further important
cases in which equivalence occurs. The first is when $\alpha'_d = \alpha_d + O(N^{-d})$, because
in this case the top degree component $e((\alpha_d - \alpha'_d) n^d)$ of $\chi \overline{\chi'}$ can be viewed as a
Lipschitz function of $n/2N$ mod 1 (say) on $[N]$ and is thus a 1-step nilsequence.
The second is when $\alpha'_d = \alpha_d + a/q$ for some standard rational $a/q$, since in this case
the top degree component $e((\alpha_d - \alpha'_d) n^d)$ of $\chi \overline{\chi'}$ is periodic with period $q$ and can
thus be viewed as a Lipschitz function of $n/q$ mod 1 and is therefore again a 1-step
nilsequence. We can combine all these cases together, and observe that $\chi$ and $\chi'$ are
equivalent when $\alpha'_d = \alpha_d + a/q + O(N^{-d})$ mod 1 for some standard rational $a/q$. It
is possible to use the quantitative equidistribution theory of nilmanifolds (see [24])
to show that these are in fact the only cases in which $\chi$ and $\chi'$ are equivalent; this
is a variant of the classical theorem of Weyl that a polynomial sequence is (totally)
equidistributed modulo 1 if and only if at least one non-constant coefficient is
irrational. In view of this, we see that $\mathrm{Symb}^d([N])$ contains ${}^*\mathbb{R}/({}^*\mathbb{Z} + \mathbb{Q} + N^{-d}\overline{\mathbb{R}})$
as a subgroup, and the symbol of $n \mapsto e(\alpha_0 + \ldots + \alpha_d n^d)$ can be identified with

$\alpha_d \bmod 1, \mathbb{Q}, O(N^{-d}) := \alpha_d + {}^*\mathbb{Z} + \mathbb{Q} + N^{-d}\overline{\mathbb{R}}.$
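The periodicity claim used in the rational case can be checked directly. The following is a minimal sketch in exact rational arithmetic; the helper names `phase` and `is_periodic` are illustrative and do not come from the paper. It confirms that $(a/q) n^d$ mod 1 is unchanged under $n \mapsto n + q$, which is what makes $e((a/q) n^d)$ a function of $n$ mod $q$.

```python
from fractions import Fraction


def phase(alpha: Fraction, n: int, d: int) -> Fraction:
    """Return alpha * n**d reduced mod 1 (the argument of e(.) up to integers)."""
    return (alpha * n**d) % 1


def is_periodic(alpha: Fraction, d: int, period: int, n_max: int = 200) -> bool:
    """Check phase(n) == phase(n + period) for all n in [-n_max, n_max)."""
    return all(
        phase(alpha, n, d) == phase(alpha, n + period, d)
        for n in range(-n_max, n_max)
    )


if __name__ == "__main__":
    # Top coefficient a/q = 3/7 in degree 4: periodic with period q = 7.
    print(is_periodic(Fraction(3, 7), d=4, period=7))
    # A shift by 3 is not a period for this coefficient.
    print(is_periodic(Fraction(3, 7), d=4, period=3))
```

The check only covers a finite window of $n$, so it illustrates the congruence $(n+q)^d \equiv n^d$ mod $q$ rather than proving it.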

However, the presence of bracket polynomials (suitably modified to avoid the
topological obstruction mentioned earlier) means that when $d \ge 2$, $\mathrm{Symb}^d([N])$
is somewhat larger than the above mentioned subgroup. We illustrate this with
the following (non-rigorous) discussion. Take $d = 2$ and consider two degree 2
nilcharacters $\chi, \chi'$ of the form

$\chi(n) \approx e(\{\alpha n\}\beta n + \gamma n^2)$

and

$\chi'(n) \approx e(\{\alpha' n\}\beta' n + \gamma' n^2)$

for some $\alpha, \beta, \gamma, \alpha', \beta', \gamma' \in {}^*\mathbb{R}$, where we interpret the symbol $\approx$ loosely to mean
that $\chi, \chi'$ are suitable vector-valued smoothings of the indicated bracket phases, of
the type discussed earlier in this section. These may also involve some lower order
nilsequences of degree 1.

As before, we consider the question of determining those values of $\alpha, \beta, \gamma, \alpha', \beta', \gamma'$
for which $\chi$ and $\chi'$ are equivalent. There are a number of fairly obvious ways in
which equivalence can occur. For instance, by modifying the previous arguments,
one can show that equivalence holds when $\alpha = \alpha'$, $\beta = \beta'$, and $\gamma - \gamma'$ is equal to
a limit integer, a standard rational, or is equal to $O(N^{-2})$. Similarly, equivalence
occurs when $\beta = \beta'$, $\gamma = \gamma'$, and $\alpha - \alpha'$ is equal to a limit integer, a standard
rational, or is equal to $O(N^{-1})$.

However, there are also some slightly less obvious ways in which equivalence
can occur. Observe that the expression $e(\{\alpha n\}\{\beta n\})$ is a Lipschitz function of the
fractional parts of $\alpha n$ and $\beta n$ and is thus a (piecewise) nilsequence of degree 1 (and
will become a genuine nilsequence after one performs an appropriate vector-valued
smoothing). On the other hand, we have the obvious identity

$e((\alpha n - \{\alpha n\})(\beta n - \{\beta n\})) = 1$

since the exponent is the product of two (limit) integers. Expanding this out and
rearranging, we obtain the (slightly imprecise) relation

$e(\{\alpha n\}\beta n) \approx e(-\{\beta n\}\alpha n + \alpha\beta n^2)$ (6.4)

where we again interpret $\approx$ loosely to mean "after a suitable vector-valued smoothing,
and ignoring lower order factors". This gives an additional route for $\chi$ and $\chi'$
to be equivalent. A similar argument also gives the variant

$e(\{\alpha n\}\beta n) \approx e(\tfrac{\alpha\beta}{2} n^2)$

whenever $\alpha, \beta$ are commensurate in the sense that $\alpha/\beta$ is a standard rational. We
thus see that the notion of equivalence is in fact already somewhat complicated in
degree 2, and the situation only becomes worse in higher degree. One can describe
equivalence of bracket polynomials explicitly using bracket calculus, as developed
in [46] (see also the earlier works [3], [30], [31], [32]), but this requires a fair amount
of notation and machinery. Fortunately, in this paper we will be able to treat the
notion of a symbol abstractly, without requiring an explicit description of the space
$\mathrm{Symb}^d([N])$.
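The exact identity behind (6.4) can be tested numerically. The sketch below uses standard rationals in place of limit reals, which is an assumption made purely so that the arithmetic is exact; the function names are hypothetical. It checks that $\{\alpha n\}\beta n$ and $-\{\beta n\}\alpha n + \alpha\beta n^2 + \{\alpha n\}\{\beta n\}$ differ by an integer, so that the two phases in (6.4) agree up to the lower order factor $e(\{\alpha n\}\{\beta n\})$.

```python
from fractions import Fraction
from math import floor


def frac(x: Fraction) -> Fraction:
    """Fractional part {x} = x - floor(x), always in [0, 1)."""
    return x - floor(x)


def identity_defect(alpha: Fraction, beta: Fraction, n: int) -> Fraction:
    """Difference (mod 1) between {alpha n} beta n and the expanded right-hand side.

    The right-hand side keeps the lower order term {alpha n}{beta n} that (6.4)
    discards, so the defect should vanish exactly.
    """
    lhs = frac(alpha * n) * beta * n
    rhs = (
        -frac(beta * n) * alpha * n
        + alpha * beta * n * n
        + frac(alpha * n) * frac(beta * n)
    )
    return (lhs - rhs) % 1


if __name__ == "__main__":
    alpha, beta = Fraction(17, 29), Fraction(5, 13)
    print(all(identity_defect(alpha, beta, n) == 0 for n in range(-50, 51)))
```

The discarded term $\{\alpha n\}\{\beta n\}$ lies in $[0, 1)$ and depends only on the fractional parts, which is why it counts as lower order in the heuristic discussion.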

More general types of filtration. The notion of a one-dimensional polynomial
$n \mapsto \alpha_0 + \ldots + \alpha_d n^d$ of degree $\le d$ can of course be generalised to higher
dimensions. For instance, we have the notion of a multidimensional polynomial

$(n_1, \ldots, n_k) \mapsto \sum_{i_1, \ldots, i_k \ge 0 : i_1 + \ldots + i_k \le d} \alpha_{i_1, \ldots, i_k} n_1^{i_1} \ldots n_k^{i_k}$

of degree $\le d$. We also have the slightly different notion of a multidimensional
polynomial

$(n_1, \ldots, n_k) \mapsto \sum_{i_1, \ldots, i_k \ge 0 : i_j \le d_j \text{ for } 1 \le j \le k} \alpha_{i_1, \ldots, i_k} n_1^{i_1} \ldots n_k^{i_k}$

of multidegree $\le (d_1, \ldots, d_k)$ for some integers $d_1, \ldots, d_k \ge 0$. We can unify these
two concepts into the notion of a multidimensional polynomial

$(n_1, \ldots, n_k) \mapsto \sum_{(i_1, \ldots, i_k) \in J} \alpha_{i_1, \ldots, i_k} n_1^{i_1} \ldots n_k^{i_k}$ (6.5)

of multidegree $\subset J$ for some finite downset $J \subset \mathbb{N}^k$, i.e. a finite set of tuples
with the property that $(i_1, \ldots, i_k) \in J$ whenever $(i_1, \ldots, i_k) \in \mathbb{N}^k$ and $i_j \le i'_j$ for
all $j = 1, \ldots, k$ for some $(i'_1, \ldots, i'_k) \in J$. Thus for instance the two-dimensional
polynomial

$(h, n) \mapsto \alpha h n + \beta h n^2 + \gamma n^3$

for $\alpha, \beta, \gamma \in {}^*\mathbb{R}$ is of multidegree $\subset J$ for

$J := \{(0,0), (0,1), (0,2), (0,3), (1,0), (1,1), (1,2)\},$

and is also of multidegree $\le (1,3)$ and of degree $\le 3$. (One can view the downset
$J$ as a variant of the Newton polytope of the polynomial.)
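The downset in this example can be recovered mechanically from the exponents of the monomials. The following sketch (with illustrative helper names that are not part of the paper) forms the smallest downset of $\mathbb{N}^k$ containing a given set of exponent tuples, and reads off the multidegree and degree bounds quoted above.

```python
from itertools import product


def downset(exponents):
    """Smallest downset of N^k (product ordering) containing the given tuples."""
    closed = set()
    for e in exponents:
        # Every tuple lying coordinatewise below e must belong to the downset.
        closed.update(product(*(range(c + 1) for c in e)))
    return closed


def multidegree_bound(J):
    """Coordinatewise maximum (d_1, ..., d_k): J lies in the box of multidegree <= this."""
    return tuple(max(coords) for coords in zip(*J))


def degree_bound(J):
    """Maximum of |i| = i_1 + ... + i_k over i in J."""
    return max(sum(i) for i in J)


if __name__ == "__main__":
    # Exponents (power of h, power of n) of  h n,  h n^2,  n^3.
    J = downset({(1, 1), (1, 2), (0, 3)})
    print(sorted(J))
    print(multidegree_bound(J), degree_bound(J))
```

Note that $J$ has seven elements while the box of multidegree $\le (1,3)$ has eight; the missing tuple $(1,3)$ is exactly the extra information that the downset records.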

In our subsequent arguments, we will need to similarly generalise the notion of a
one-dimensional nilcharacter $n \mapsto \chi(n)$ of degree $\le d$ to a multidimensional
nilcharacter $(n_1, \ldots, n_k) \mapsto \chi(n_1, \ldots, n_k)$ of degree $\le d$, of multidegree $\le (d_1, \ldots, d_k)$, or
of multidegree $\subset J$. We will define these concepts precisely in a short while, but
we mention for now that the polynomial phase

$(h, n) \mapsto e(\alpha h n + \beta h n^2 + \gamma n^3)$

will be a two-dimensional nilcharacter of multidegree $\subset J$, multidegree $\le (1,3)$,
and degree $\le 3$, where $J$ is as above. Moreover, variants of this phase, such as (a
suitable vector-valued smoothing of)

$(h, n) \mapsto e(\{\alpha_1 h\}\alpha_2 n + \{\{\beta_1 n\}\beta_2 h\}\beta_3 n + \{\gamma_1 n^2\}\gamma_2 n),$

will also have the same multidegree and degree as the preceding example.

The multidegree of a nilcharacter $\chi(n_1, \ldots, n_k)$ is a more precise measurement
of the complexity of $\chi$ than the degree, because it separates the behaviour of the
different variables $n_1, \ldots, n_k$. We will also need a different refinement of the
notion of degree, this time for a one-dimensional nilcharacter $n \mapsto \chi(n)$, which now
separates the behaviour of different top degree components of $\chi$, according to their
"rank". Heuristically, the rank of such a component is the number of fractional
part operations $x \mapsto \{x\}$ that are needed to construct that component, plus one;
thus for instance

$n \mapsto e(\alpha n^3)$

has degree 3 and rank 1,

$n \mapsto e(\{\alpha n^2\}\beta n)$

has degree 3 and rank 2 (after vector-valued smoothing),

$n \mapsto e(\{\{\alpha n\}\beta n\}\gamma n)$

has degree 3 and rank 3 (after vector-valued smoothing), and so forth. We will then
need a notion of a nilcharacter $\chi$ of degree-rank $\le (d, r)$, which roughly speaking
means that all the components used to build $\chi$ either are of degree $< d$, or else are
of degree exactly $d$ but rank at most $r$. Thus for instance,

$n \mapsto e(\{\alpha n\}\beta n + \gamma n^3)$

has degree-rank $\le (3, 1)$ (after vector-valued smoothing), while

$n \mapsto e(\{\alpha n\}\beta n + \gamma n^3 + \{\delta n^2\}\epsilon n)$

has degree-rank $\le (3, 2)$ (after vector-valued smoothing), and

$n \mapsto e(\{\alpha n\}\beta n + \gamma n^3 + \{\delta n^2\}\epsilon n + \{\{\mu n\}\nu n\}\rho n)$

has degree-rank $\le (3, 3)$ (after vector-valued smoothing).
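The heuristic bookkeeping in these examples can be sketched in a few lines. The encoding below is an assumption made for illustration only: each bracket monomial is recorded as a pair (degree, number of fractional part operations), and the degree-rank of a sum is taken to be the lexicographic maximum of (degree, brackets + 1). This mirrors the heuristic stated above, not the rigorous definition given later via filtrations.

```python
def degree_rank(terms):
    """Heuristic degree-rank of a sum of bracket monomials.

    Each term is (degree, brackets), where brackets counts the fractional part
    operations used to build it. Python compares tuples lexicographically,
    which matches the degree-rank ordering.
    """
    return max((degree, brackets + 1) for degree, brackets in terms)


# Hypothetical encodings of the monomials appearing in the examples above.
BRACKET_QUADRATIC = (2, 1)   # {alpha n} beta n
CUBIC = (3, 0)               # gamma n^3
BRACKET_CUBIC_1 = (3, 1)     # {delta n^2} epsilon n
BRACKET_CUBIC_2 = (3, 2)     # {{mu n} nu n} rho n

if __name__ == "__main__":
    print(degree_rank([BRACKET_QUADRATIC, CUBIC]))
    print(degree_rank([BRACKET_QUADRATIC, CUBIC, BRACKET_CUBIC_1]))
    print(degree_rank([BRACKET_QUADRATIC, CUBIC, BRACKET_CUBIC_1, BRACKET_CUBIC_2]))
```

The degree 2 term never affects the answer here, reflecting the fact that only the top degree components are measured by the rank.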

In order to make precise the notions of multidegree and degree-rank for nilcharacters,
it is convenient to adopt an abstract formalism that unifies degree, multidegree,
and degree-rank into a single theory. We need the following abstract definition.

Definition 6.7 (Ordering). An ordering $I = (I, \prec, +, 0)$ is a set $I$ equipped with
a partial ordering $\prec$, a binary operation $+ : I \times I \to I$, and a distinguished element
$0 \in I$ with the following properties:

(i) The operation $+$ is commutative and associative, and has $0$ as the identity
element.

(ii) The partial ordering $\prec$ has $0$ as the minimal element.

(iii) If $i, j \in I$ are such that $i \prec j$, then $i + k \prec j + k$ for all $k \in I$.

(iv) For every $d \in I$, the initial segment $\{i \in I : i \prec d\}$ is finite.

A finite downset in $I$ is a finite subset $J$ of $I$ with the property that $j \in J$ whenever
$j \in I$ and $j \prec i$ for some $i \in J$.

In this paper, we will only need the following three specific orderings (with k a 
standard positive integer): 

(i) The degree ordering, in which $I = \mathbb{N}$ with the usual ordering, addition,
and zero element.

(ii) The multidegree ordering, in which $I = \mathbb{N}^k$ with the usual addition and
zero element, and with the product ordering, thus $(i'_1, \ldots, i'_k) \prec (i_1, \ldots, i_k)$
if $i'_j \le i_j$ for all $1 \le j \le k$.

(iii) The degree-rank ordering, in which $I$ is the sector $\mathrm{DR} := \{(d, r) \in \mathbb{N}^2 : 0 \le
r \le d\}$ with the usual addition and zero element, and the lexicographical
ordering, that is to say $(d', r') \prec (d, r)$ if $d' < d$ or if $d' = d$ and $r' \le r$.

It is easy to verify that each of these three explicit orderings obeys the abstract
axioms in Definition 6.7. In the case of the degree or degree-rank orderings, $I$ is
totally ordered (for instance, the first few degree-ranks are (0, 0), (1, 0), (1, 1), (2, 0),
(2, 1), (2, 2), (3, 0), . . .), and so the only finite downsets are the initial segments.
For the multidegree ordering, however, the initial segments are not the only finite
downsets that can occur.
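These remarks can be verified on small examples. The sketch below is a finite sanity check, not a proof, and all helper names are illustrative. It lists the first few degree-ranks in lexicographic order, tests axiom (iii) of Definition 6.7 on a finite portion of DR, and exhibits a finite downset of $\mathbb{N}^2$ (product ordering) that is not an initial segment.

```python
from itertools import product


def degree_ranks(max_degree):
    """Elements (d, r) of DR with d <= max_degree, in lexicographic order."""
    return sorted((d, r) for d in range(max_degree + 1) for r in range(d + 1))


def add(i, j):
    """Componentwise addition of tuples."""
    return tuple(a + b for a, b in zip(i, j))


def product_le(i, j):
    """Product ordering on N^k: i precedes j if i_t <= j_t for every t."""
    return all(a <= b for a, b in zip(i, j))


def is_downset(J, le, universe):
    """True if every element of the universe below an element of J lies in J."""
    J = set(J)
    return all(j in J for i in J for j in universe if le(j, i))


def is_initial_segment(J, le, universe):
    """True if J equals {i : i precedes d} for some d in the universe."""
    J = set(J)
    return any({i for i in universe if le(i, d)} == J for d in universe)


if __name__ == "__main__":
    print(degree_ranks(3)[:7])
    grid = list(product(range(3), repeat=2))
    corner = {(0, 0), (1, 0), (0, 1)}
    print(is_downset(corner, product_le, grid), is_initial_segment(corner, product_le, grid))
```

Python's built-in tuple comparison is lexicographic, so `<=` on pairs implements the degree-rank ordering directly.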

The one-dimensional notions of a filtration, nilsequence, nilcharacter, and symbol
can be easily generalised to arbitrary orderings. We give the bare definitions
here, and defer the more thorough treatment of these concepts to Appendix B and
Appendix E. We will however remark that when $I$ is the degree ordering, then all
of the notions defined below simplify to the one-dimensional counterparts defined
earlier.

Definition 6.8 (Filtered group). Let $I$ be an ordering and let $G$ be a group. By
an $I$-filtration on $G$ we mean a collection $G_I = (G_i)_{i \in I}$ of subgroups indexed by $I$,
with the following properties:

(i) (Nesting) If $i, j \in I$ are such that $i \prec j$, then $G_i \supseteq G_j$.

(ii) (Commutators) For every $i, j \in I$, we have $[G_i, G_j] \subseteq G_{i+j}$.

If $d \in I$, we say that $G$ has degree $\le d$ if $G_i$ is trivial whenever $i \not\prec d$. More
generally, if $J$ is a downset in $I$, we say that $G$ has degree $\subset J$ if $G_i$ is trivial
whenever $i \notin J$.

Let us explicitly adapt the above abstract definitions to the three specific order- 
ings mentioned earlier. 

Definition 6.9. If $(d_1, \ldots, d_k) \in \mathbb{N}^k$, we define a nilpotent Lie group of multidegree
$\le (d_1, \ldots, d_k)$ to be a nilpotent $I$-filtered Lie group of degree $\le (d_1, \ldots, d_k)$, where
$I = \mathbb{N}^k$ is the multidegree ordering. Similarly, if $J$ is a downset, define the notion
of a nilpotent Lie group of multidegree $\subset J$.

If $(d, r) \in \mathrm{DR}$, define a nilpotent Lie group of degree-rank $\le (d, r)$ to be a
nilpotent DR-filtered Lie group $G$ of degree $\le (d, r)$, with the additional axioms
$G_{(0,0)} = G$ and $G_{(d,0)} = G_{(d,1)}$ for all $d \ge 1$.

We define the notion of a filtered nilmanifold of multidegree $\le (d_1, \ldots, d_k)$,
multidegree $\subset J$, or degree-rank $\le (d, r)$ similarly.

Note that the degree-rank filtration needs to obey some additional axioms, which 
are needed in order for the rank r to play a non-trivial role. As such, the unification 
here of degree, multidegree, and degree-rank, is not quite perfect; however this 
wrinkle is only of minor technical importance and should be largely ignored on a 
first reading. 

Example 6.10. If $G$ is a filtered nilpotent group of multidegree $\le (1, 1)$, then the
groups $G_{(1,0)}$ and $G_{(0,1)}$ must be abelian normal subgroups of $G_{(0,0)}$, and their
commutator $[G_{(1,0)}, G_{(0,1)}]$ must lie inside the group $G_{(1,1)}$, which is a central subgroup
of $G_{(0,0)}$.

If $G$ is a filtered nilpotent group of degree-rank $\le (d, d)$, then $(G_{(i,0)})_{i \ge 0}$ is an
$\mathbb{N}$-filtration of degree $\le d$. But if we reduce the rank $r$ to be strictly less than
$d$, then we obtain some additional relations between the $G_{(i,0)}$ that do not come
from the filtration property. For instance, if $G$ has degree-rank $\le (3, 2)$, then the
group $[G_{(1,0)}, [G_{(1,0)}, G_{(1,0)}]]$ must now be trivial; if $G$ has degree-rank $\le (3, 1)$,
then the group $[G_{(1,0)}, G_{(2,0)}]$ must also be trivial. More generally, if $G$ has
degree-rank $\le (d, r)$, then any iterated commutator of $g_{i_1}, \ldots, g_{i_m}$ with $g_{i_j} \in G_{(i_j,0)}$ for
$j = 1, \ldots, m$ will be trivial whenever $i_1 + \ldots + i_m > d$, or if $i_1 + \ldots + i_m = d$ and
$m > r$.

Example 6.11. If $(G_i)_{i \in \mathbb{N}}$ is an $\mathbb{N}$-filtration of $G$ of degree $\le d$, then $(G_{|i|})_{i \in \mathbb{N}^k}$
is an $\mathbb{N}^k$-filtration of $G$ of multidegree $\subset \{i \in \mathbb{N}^k : |i| \le d\}$, where we recall the
notational convention $|(i_1, \ldots, i_k)| = i_1 + \ldots + i_k$. Conversely, if $J$ is a finite downset
of $\mathbb{N}^k$ and $(G_i)_{i \in \mathbb{N}^k}$ is an $\mathbb{N}^k$-filtration of $G$ of multidegree $\subset J$, then

$\left( \bigvee_{i \in \mathbb{N}^k : |i| = j} G_i \right)_{j \in \mathbb{N}}$

is easily verified (using Lemma B.1) to be an $\mathbb{N}$-filtration of degree $\le \max_{i \in J} |i|$,
where $\bigvee_{a \in A} G_a$ is the group generated by $\bigcup_{a \in A} G_a$. In particular, any multidegree
$\le (d_1, \ldots, d_k)$ filtration induces a degree $\le d_1 + \ldots + d_k$ filtration.

In a similar spirit, every degree-rank $\le (d, r)$ filtration $(G_{(d',r')})_{(d',r') \in \mathrm{DR}}$ of a
group $G$ induces a degree $\le d$ filtration $(G_{(i,0)})_{i \in \mathbb{N}}$. In the converse direction, if
$(G_i)_{i \in \mathbb{N}}$ is a degree $\le d$ filtration of $G$ with $G = G_0$, then we can create a degree-rank
$\le (d, d)$ filtration $(G_{(d',r')})_{(d',r') \in \mathrm{DR}}$ by setting $G_{(d',r')}$ to be the group generated
by all the iterated commutators of $g_{i_1}, \ldots, g_{i_m}$ with $g_{i_j} \in G_{i_j}$ for $j = 1, \ldots, m$
for which either $i_1 + \ldots + i_m > d'$, or $i_1 + \ldots + i_m = d'$ and $m \ge \max(r', 1)$; this
can easily be verified to indeed be a filtration, thanks to Lemma B.1.

Example 6.12. Let $d \ge 1$ be a standard integer. We can give the unit circle
$\mathbb{T}$ the structure of a degree-rank filtered nilmanifold of degree-rank $\le (d, 1)$ by
setting $G = \mathbb{R}$ and $\Gamma = \mathbb{Z}$ with $G_{(d',r')} := \mathbb{R}$ for $(d', r') \le (d, 1)$ and $G_{(d',r')} := \{0\}$
otherwise. This is also the filtration obtained from the degree $\le d$ filtration (see
Example 4.3) using the construction in Example 6.11.

Example 6.13 (Products). If $G_I$ and $G'_I$ are $I$-filtrations on groups $G, G'$ then we
can give the product $G \times G'$ an $I$-filtration in an obvious way by setting $(G \times G')_i :=
G_i \times G'_i$. The degree of $G \times G'$ is the union of the degrees of $G$ and $G'$. Similarly the
product $G_1/\Gamma_1 \times G_2/\Gamma_2$ of two $I$-filtered nilmanifolds is an $I$-filtered nilmanifold.

Example 6.14 (Pushforward and pullback). Let $\phi : G \to H$ be a homomorphism
of groups. Then any $I$-filtration $H_I = (H_i)_{i \in I}$ of $H$ induces a pullback
$I$-filtration $\phi^* H_I := (\phi^{-1}(H_i))_{i \in I}$. Similarly, any $I$-filtration $G_I = (G_i)_{i \in I}$ on $G$
induces a pushforward $I$-filtration $\phi_* G_I := (\phi(G_i))_{i \in I}$ on $H$. In particular, if $\Gamma$
is a subgroup of $G$, then we can pull back a filtration $G_I = (G_i)_{i \in I}$ of $G$ by the
inclusion map $\iota : \Gamma \hookrightarrow G$ to create the restriction $\Gamma_I := (\Gamma_i)_{i \in I}$ of that filtration.
It is a trivial matter to check that the subgroups of this filtration are given by

$\Gamma_i := \Gamma \cap G_i.$

Definition 6.15 (Filtered quotient space). An $I$-filtered quotient space is a quotient
$G/\Gamma$, where $G$ is an $I$-filtered group and $\Gamma$ is a subgroup of $G$ (with the induced
filtration, see Example 6.14).

An $I$-filtered homomorphism $\phi : G/\Gamma \to G'/\Gamma'$ between $I$-filtered quotient spaces
is a group homomorphism $\phi : G \to G'$ which maps $\Gamma$ to $\Gamma'$, and also maps $G_i$ to
$G'_i$ for all $i \in I$. Note that such a homomorphism descends to a map from $G/\Gamma$ to
$G'/\Gamma'$.

If $G$ is a nilpotent $I$-filtered Lie group, and $\Gamma$ is a discrete cocompact subgroup
of $G$ which is rational with respect to $G_I$ (thus $\Gamma_i := \Gamma \cap G_i$ is cocompact in $G_i$ for
each $i \in I$), we call $G/\Gamma = (G/\Gamma, G_I)$ an $I$-filtered nilmanifold. We say that $G/\Gamma$
has degree $\le d$ or $\subset J$ if $G$ has degree $\le d$ or $\subset J$.

Example 6.16 (Subnilmanifolds). Let $G/\Gamma$ be an $I$-filtered nilmanifold of degree
$\subset J$. If $H$ is a rational subgroup of $G$, then $H/(H \cap \Gamma)$ is also a filtered nilmanifold
of degree $\subset J$ (using Example 6.14), with an inclusion homomorphism from $H/(H \cap \Gamma)$
to $G/\Gamma$; we refer to $H/(H \cap \Gamma)$ as a subnilmanifold of $G/\Gamma$.

We isolate three important examples of a filtered group, in which $G$ is the additive
group $\mathbb{Z}$ or $\mathbb{Z}^k$.

Definition 6.17 (Basic filtrations). We define the following filtrations:

• The degree filtration $\mathbb{Z}^k_{\mathbb{N}}$ on $G = \mathbb{Z}^k$, in which $I = \mathbb{N}$ is the degree ordering
and $G_i = G$ for $i \le 1$ and $G_i = \{0\}$ otherwise. In many cases $k$ will equal
1 or 2.

• The multidegree filtration $\mathbb{Z}^k_{\mathbb{N}^k}$ on $G = \mathbb{Z}^k$, in which $I = \mathbb{N}^k$ is the
multidegree ordering and $G_{\vec 0} = \mathbb{Z}^k$, $G_{\vec e_i} = \langle \vec e_i \rangle$ for $i = 1, \ldots, k$, and $G_{\vec i} = \{0\}$
otherwise, with $\vec e_1, \ldots, \vec e_k$ being the standard basis for $\mathbb{Z}^k$;

• The degree-rank filtration $\mathbb{Z}_{\mathrm{DR}}$ on $G = \mathbb{Z}$, in which $I = \mathrm{DR}$ is the
degree-rank ordering and $G_{(0,0)} = G_{(1,0)} = \mathbb{Z}$ and $G_{(d,r)} = \{0\}$ otherwise.

Definition 6.18 (Polynomial map). Suppose that $H$ and $G$ are $I$-filtered groups
with $H = (H, +)$ abelian$^4$. Then for any map $g : H \to G$ we define the derivative

$\partial_h g(n) := g(n + h) g(n)^{-1}.$ (6.6)

We say that $g : H \to G$ is polynomial if

$\partial_{h_1} \ldots \partial_{h_m} g(n) \in G_{i_1 + \ldots + i_m}$ (6.7)

for all $m \ge 0$, all $i_1, \ldots, i_m \in I$ and all $h_j \in H_{i_j}$ for $j = 1, \ldots, m$, and for all
$n \in H$.

$^4$ This is not actually a necessary assumption; see Appendix E. However, in the main body of
the paper we will only be concerned with polynomial maps on additive domains.

We denote by $\mathrm{poly}(H_I \to G_I)$ the space of all polynomial maps from $H_I$ to $G_I$.
As usual, we use ${}^*\mathrm{poly}(H_I \to G_I)$ to denote the space of all limit polynomial maps
from ${}^*H_I$ to ${}^*G_I$ (i.e. ultralimits of polynomial maps in $\mathrm{poly}(H_I \to G_I)$).
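In the simplest abelian setting the derivative (6.6) is just a finite difference. The sketch below takes $G = \mathbb{R}$ written additively (modelled by exact rationals) and $H = \mathbb{Z}$; this specialisation and the function names are assumptions made for illustration. It checks that a cubic has constant third derivative and vanishing fourth derivative, which is condition (6.7) for the filtration $G_i = \mathbb{R}$ for $i \le 3$ and $G_i = \{0\}$ otherwise.

```python
from fractions import Fraction


def derivative(g, h):
    """Additive form of (6.6): (d_h g)(n) = g(n + h) - g(n)."""
    return lambda n: g(n + h) - g(n)


def iterated_derivative(g, steps):
    """Apply d_{h_1}, ..., d_{h_m} in turn for steps = [h_1, ..., h_m]."""
    for h in steps:
        g = derivative(g, h)
    return g


if __name__ == "__main__":
    a, b = Fraction(2, 7), Fraction(5, 3)

    def cubic(n):
        return a * n**3 + b * n

    third = iterated_derivative(cubic, [2, 3, 5])
    fourth = iterated_derivative(cubic, [2, 3, 5, 7])
    # The third derivative is the constant 6 * a * h1 * h2 * h3.
    print({third(n) for n in range(-5, 6)}, 6 * a * 2 * 3 * 5)
    print({fourth(n) for n in range(-5, 6)})
```

For non-abelian $G$ the derivative is $g(n+h) g(n)^{-1}$ and the order of multiplication matters; the abelian case shown here avoids that complication.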

Many facts about these spaces (in some generality) are established in Appendix
B where, in particular, a remarkable result essentially due to Lazard and Leibman
[HI 331131] is established: $\mathrm{poly}(H_I \to G_I)$ is a group. The material in Appendix B
is formulated in the general setting of abstract orderings $I$ and for arbitrary (and
possibly non-abelian) groups $H_I$, but for our applications we are only interested in
the special case when $H_I$ is $\mathbb{Z}$ or $\mathbb{Z}^k$ with the degree, multidegree, or degree-rank
filtration as defined above.

Before moving on let us be quite explicit about what the notion of a polynomial 
map is in each of the three cases, since the definitions take a certain amount of 
unravelling. 

• (Degree filtration) If $H = \mathbb{Z}^k$ with the degree filtration $\mathbb{Z}^k_{\mathbb{N}}$, then $\mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}} \to
G_{\mathbb{N}})$ consists of maps $g : \mathbb{Z}^k \to G$ with the property that

$\partial_{h_1} \ldots \partial_{h_m} g(n) \in G_m$

for all $m \ge 0$, $h_1, \ldots, h_m \in \mathbb{Z}^k$ and all $n \in \mathbb{Z}^k$. This space is precisely the
same space as the one considered in [24, §6]. The space ${}^*\mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}} \to G_{\mathbb{N}})$
is defined similarly, except that $g : {}^*\mathbb{Z}^k \to {}^*G$ is now a limit map, and
all spaces such as $\mathbb{Z}^k$ and $G_m$ need to be replaced by their ultrapowers.
(Similarly for the other two examples in this list.)

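• (Multidegree filtration) If $H = \mathbb{Z}^k$ with the multidegree filtration $\mathbb{Z}^k_{\mathbb{N}^k}$,
then $\mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}^k} \to G_{\mathbb{N}^k})$ consists of maps $g : \mathbb{Z}^k \to G$ with the property
that

$\partial_{\vec e_{i_1}} \ldots \partial_{\vec e_{i_m}} g(n) \in G_{\vec e_{i_1} + \ldots + \vec e_{i_m}}$

for all $m \ge 0$, all $i_1, \ldots, i_m \in \{1, \ldots, k\}$ and all $n \in \mathbb{Z}^k$. To relate this space to the
analogous spaces for the degree ordering, observe (using Example 6.11)
that

$\mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}} \to (G_i)_{i \in \mathbb{N}}) = \mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}^k} \to (G_{|\vec i|})_{\vec i \in \mathbb{N}^k})$

for any $\mathbb{N}$-filtration $(G_i)_{i \in \mathbb{N}}$, and conversely one has

$\mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}^k} \to (G_{\vec i})_{\vec i \in \mathbb{N}^k}) \subset \mathrm{poly}\left(\mathbb{Z}^k_{\mathbb{N}} \to \left(\bigvee_{\vec i \in \mathbb{N}^k : |\vec i| = j} G_{\vec i}\right)_{j \in \mathbb{N}}\right)$

for any $\mathbb{N}^k$-filtration $(G_{\vec i})_{\vec i \in \mathbb{N}^k}$. This is of course related to the obvious fact
that a polynomial of multidegree $\le (d_1, \ldots, d_k)$ is automatically of degree
$\le d_1 + \ldots + d_k$.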

• (Degree-rank filtration) If $H = \mathbb{Z}$ with the degree-rank filtration $\mathbb{Z}_{\mathrm{DR}}$, then
$\mathrm{poly}(\mathbb{Z}_{\mathrm{DR}} \to G_{\mathrm{DR}})$ consists of maps $g : \mathbb{Z} \to G$ with the property that

$\partial_{h_1} \ldots \partial_{h_m} g(n) \in G_{(m,0)}$

whenever $m \ge 0$, $h_1, \ldots, h_m \in \mathbb{Z}$ and $n \in \mathbb{Z}$. We observe (using Example
6.11) the obvious equality

$\mathrm{poly}(\mathbb{Z}_{\mathrm{DR}} \to (G_{(d,r)})_{(d,r) \in \mathrm{DR}}) = \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G_{(i,0)})_{i \in \mathbb{N}})$ (6.8)

for any DR-filtration $(G_{(d,r)})_{(d,r) \in \mathrm{DR}}$. Thus, a degree-rank filtration $G_{\mathrm{DR}}$
on $G$ does not change the notion of a polynomial sequence, but instead
gives some finer information on the group $G$ (and in particular, it indicates
that certain iterated commutators of the $G_{(d,r)}$ vanish, which is information
that cannot be discerned just from the knowledge that $(G_{(i,0)})_{i \in \mathbb{N}}$ is
an $\mathbb{N}$-filtration).

Definition 6.19 (Nilsequences and nilcharacters). Let $I$ be an ordering, and let
$J$ be a finite downset in $I$. Let $H$ be an abelian $I$-filtered group. A (polynomial)
nilsequence of degree $\subset J$ is any function of the form

$\chi(n) = F(g(n) {}^*\Gamma),$

where

• $G/\Gamma = (G/\Gamma, G_I)$ is an $I$-filtered nilmanifold of degree $\subset J$;

• $g \in {}^*\mathrm{poly}(H_I \to G_I)$ is a limit polynomial map from ${}^*H_I$ to ${}^*G_I$; and

• $F \in \mathrm{Lip}({}^*(G/\Gamma) \to \overline{\mathbb{C}}^\omega)$.

The space of all such nilsequences will be denoted $\mathrm{Nil}^{\subset J}({}^*H)$. We define the notion
of a nilsequence of degree $\le d$ for some $d \in I$, and the space $\mathrm{Nil}^{\le d}({}^*H)$, similarly.
If $\Omega$ is a limit subset of ${}^*H$, the restriction of the nilsequences in $\mathrm{Nil}^{\subset J}({}^*H)$ to $\Omega$
will be denoted $\mathrm{Nil}^{\subset J}(\Omega)$, and we define $\mathrm{Nil}^{\le d}(\Omega)$ similarly.

We refer to the map $n \mapsto g(n) {}^*\Gamma$ as a limit polynomial orbit in $G/\Gamma$, and denote
the space of such orbits as ${}^*\mathrm{poly}(H_I \to (G/\Gamma)_I)$.

Suppose that $d \in I$. Then $\chi$ is said to be a degree $d$ nilcharacter if $\chi$ is a degree
$\le d$ nilsequence with the following additional properties:

• $F \in \mathrm{Lip}({}^*(G/\Gamma) \to \overline{S^\omega})$ (thus $|F| = 1$) and

• $F(g_d x) = e(\eta(g_d)) F(x)$ for all $x \in G/\Gamma$ and $g_d \in G_d$, where $\eta : G_d \to \mathbb{R}$ is
a continuous standard homomorphism which maps $\Gamma_d$ to the integers. We
call $\eta$ the vertical frequency of $F$.

The space of all degree $d$ nilcharacters on ${}^*H$ will be denoted $\Xi^d({}^*H)$. If $\Omega$ is a limit
subset of ${}^*H$, the restriction of the nilcharacters in $\Xi^d({}^*H)$ to $\Omega$ will be denoted
$\Xi^d(\Omega)$.

With the multidegree ordering, a degree $(d_1, \ldots, d_k)$ nilcharacter will be referred
to as a multidegree $(d_1, \ldots, d_k)$ nilcharacter, and the space of such characters
on $\Omega$ denoted $\Xi^{(d_1,\ldots,d_k)}_{\mathrm{Multi}}(\Omega)$; we similarly write $\mathrm{Nil}^{\subset J}(\Omega)$ or $\mathrm{Nil}^{\le (d_1,\ldots,d_k)}(\Omega)$ as
$\mathrm{Nil}^{\subset J}_{\mathrm{Multi}}(\Omega)$ or $\mathrm{Nil}^{\le (d_1,\ldots,d_k)}_{\mathrm{Multi}}(\Omega)$ for emphasis.

Similarly, with the degree-rank ordering, and assuming $G/\Gamma$ is a filtered nilmanifold
of degree-rank $\le (d, r)$ (so in particular, we enforce the axioms $G_{(0,0)} = G$
and $G_{(d,0)} = G_{(d,1)}$), a degree $(d, r)$ nilcharacter will be referred to as a
degree-rank $(d, r)$ nilcharacter. The space of nilcharacters on $\Omega$ of degree-rank $(d, r)$ will
be denoted $\Xi^{(d,r)}_{\mathrm{DR}}(\Omega)$ (note that this is distinct from the space $\Xi^{(d_1,d_2)}_{\mathrm{Multi}}(\Omega)$ of
two-dimensional nilcharacters of multidegree $(d_1, d_2)$), and the nilsequences on $\Omega$ of
degree-rank $\le (d, r)$ will similarly be denoted $\mathrm{Nil}^{\le (d,r)}_{\mathrm{DR}}(\Omega)$.

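Example 6.20. Let $J \subset \mathbb{N}^k$ be a finite downset. Then any sequence of the form

$(n_1, \ldots, n_k) \mapsto F\left( \sum_{(i_1,\ldots,i_k) \in J} \alpha_{i_1,\ldots,i_k} n_1^{i_1} \ldots n_k^{i_k} \bmod 1 \right),$

where $\alpha_{i_1,\ldots,i_k} \in {}^*\mathbb{R}$ and $F \in \mathrm{Lip}({}^*\mathbb{T} \to \overline{\mathbb{C}}^\omega)$, is a nilsequence on $\mathbb{Z}^k$ of multidegree
$\subset J$, as can easily be seen by giving $G := \mathbb{R}$ the $\mathbb{N}^k$-filtration $G_i := \mathbb{R}$ for $i \in J$ and
$G_i := \{0\}$ otherwise, and setting $\Gamma := \mathbb{Z}$ and $g \in {}^*\mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}^k} \to \mathbb{R})$ to be the limit
polynomial $n \mapsto \sum_{(i_1,\ldots,i_k) \in J} \alpha_{i_1,\ldots,i_k} n_1^{i_1} \ldots n_k^{i_k}$.

For similar reasons, any sequence of the form

$(n_1, \ldots, n_k) \mapsto e\left( \sum_{(i_1,\ldots,i_k) \in \mathbb{N}^k : i_1 + \ldots + i_k \le d} \alpha_{i_1,\ldots,i_k} n_1^{i_1} \ldots n_k^{i_k} \bmod 1 \right)$

is a nilcharacter on $\mathbb{Z}^k$ of degree $d$, and any sequence of the form

$(n_1, \ldots, n_k) \mapsto e\left( \sum_{(i_1,\ldots,i_k) \in \mathbb{N}^k : i_j \le d_j \text{ for } j = 1,\ldots,k} \alpha_{i_1,\ldots,i_k} n_1^{i_1} \ldots n_k^{i_k} \bmod 1 \right)$

is a multidegree $(d_1, \ldots, d_k)$ nilcharacter on $\mathbb{Z}^k$.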

Example 6.21. Any degree 2 nilsequence of magnitude 1 is automatically a
degree-rank $\le (3, 0)$ nilcharacter, since every degree $\le 2$ nilmanifold is automatically a
degree-rank $\le (2, 2)$ nilmanifold, which can then be converted trivially to a degree-rank
$\le (3, 0)$ nilmanifold (with a trivial group $G_{(3,0)}$). Thus for instance for $\alpha, \beta \in \mathbb{R}$,

$n \mapsto e(\{\alpha n\}\beta n)$

is nearly a degree-rank (3, 0) nilcharacter, and becomes a genuine degree-rank (3, 0)
nilcharacter after vector-valued smoothing.

If $\alpha \in {}^*\mathbb{R}$, then the sequence

$n \mapsto e(\alpha n^3)$

is a degree-rank (3, 1) nilcharacter. Indeed, we can give $G = \mathbb{R}$ a degree-rank
$\le (3, 1)$ filtration $G_{\mathrm{DR}}$ by setting $G_{(d,r)} := \mathbb{R}$ for $(d, r) \le (3, 1)$, and $G_{(d,r)} := \{0\}$
otherwise.

Next, if $\alpha, \beta \in {}^*\mathbb{R}$, then the sequence

$n \mapsto e(\{\alpha n^2\}\beta n)$ (6.9)

is nearly a degree-rank (3, 2) nilcharacter (and becomes genuinely so after
vector-valued smoothing). To see this, let $G$ be the Heisenberg nilpotent group (6.1),
which we give the following degree-rank filtration:

$G_{(0,0)} = G_{(1,0)} = G_{(1,1)} := G$
$G_{(2,0)} = G_{(2,1)} := \langle e_1, [e_1, e_2] \rangle_{\mathbb{R}} = \{ e_1^{t_1} [e_1, e_2]^{t_{12}} : t_1, t_{12} \in \mathbb{R} \}$
$G_{(2,2)} = G_{(3,0)} = G_{(3,1)} = G_{(3,2)} := \langle [e_1, e_2] \rangle_{\mathbb{R}} = \{ [e_1, e_2]^{t_{12}} : t_{12} \in \mathbb{R} \}$
$G_{(d,r)} := \{\mathrm{id}\}$ for all other $(d, r) \in \mathrm{DR}$.

One easily verifies that this is a degree-rank $\le (3, 2)$ filtration. If we then set
$g : {}^*\mathbb{Z} \to {}^*G$ to be the limit sequence $g(n) := e_2^{\beta n} e_1^{\alpha n^2}$, one easily verifies that $g$ is
a limit polynomial with respect to this degree-rank filtration. If one then lets $F$ be
the piecewise Lipschitz function (6.2), then we see that

$F(g(n) {}^*\Gamma) = e(\{\alpha n^2\}\beta n)$

and so we see that $n \mapsto e(\{\alpha n^2\}\beta n)$ is indeed a piecewise degree-rank (3, 2)
nilcharacter.

A similar argument (using the free 3-step nilpotent manifold on three generators,
which has degree $\le 3$ and hence degree-rank $\le (3, 3)$) shows that

$n \mapsto e(\{\{\alpha n\}\beta n\}\gamma n)$

is nearly a degree-rank (3, 3) nilcharacter, and becomes a genuine degree-rank (3, 3)
nilcharacter after applying vector-valued smoothing; see [28, Appendix E] for the
relevant calculations.

These examples should help illustrate the heuristic that a degree-rank $(d, r)$
nilcharacter is built up using (suitable vector-valued smoothings of) bracket monomials
which either have degree less than $d$, or have degree exactly $d$ and involve at
most $r - 1$ applications of the fractional part operation.

We observe (using Example 6.11) the following obvious inclusions:

(i) A multidegree $\le (d_1, \ldots, d_k)$ nilsequence on $\mathbb{Z}^k$ is automatically a degree
$\le d_1 + \ldots + d_k$ nilsequence.

(ii) A multidegree $(d_1, \ldots, d_k)$ nilcharacter on $\mathbb{Z}^k$ is automatically a degree
$d_1 + \ldots + d_k$ nilcharacter.

(iii) A multidegree $(d_1, \ldots, d_{k-1}, 0)$ nilsequence on $\mathbb{Z}^k$ is constant in the $n_k$
variable, and descends to a multidegree $(d_1, \ldots, d_{k-1})$ nilsequence on $\mathbb{Z}^{k-1}$.

(iv) A degree-rank $\le (d, r)$ nilsequence on $\mathbb{Z}$ is automatically a degree $\le d$
nilsequence.

(v) A degree $\le d$ nilsequence on $\mathbb{Z}$ is automatically a degree-rank $\le (d, d)$
nilsequence.

(vi) A degree $d$ nilcharacter on $\mathbb{Z}$ is automatically a degree-rank $\le (d, d)$
nilcharacter.

It is not quite true, though, that a degree-rank $(d, r)$ nilcharacter is a degree $d$
nilcharacter if $r > 1$, because the former need not exhibit vertical frequency
behaviour for degree-ranks $(d, r')$ with $r' < r$.

Definition 6.22 (Equivalence and symbols). Let $H$ be an $I$-filtered group, let
$d \in I$, and let $\Omega$ be a limit subset of ${}^*H$. Two nilcharacters $\chi, \chi' \in \Xi^d(\Omega)$ are
said to be equivalent if $\chi \otimes \overline{\chi'}$ is a nilsequence of degree strictly less than $d$. Write
$[\chi]_{\mathrm{Symb}^d(\Omega)}$ for the equivalence class of $\chi$ with respect to this relation; this we shall
refer to as the symbol of $\chi$. Write $\mathrm{Symb}^d(\Omega)$ for the space of all such equivalence
classes.

We write $\mathrm{Symb}^{(d_1,\ldots,d_k)}_{\mathrm{Multi}}(\Omega)$ for the symbols of nilcharacters $\chi \in \Xi^{(d_1,\ldots,d_k)}_{\mathrm{Multi}}(\Omega)$
of multidegree $(d_1, \ldots, d_k)$, and $\mathrm{Symb}^{(d,r)}_{\mathrm{DR}}(\Omega)$ for the symbols of nilcharacters
$\chi \in \Xi^{(d,r)}_{\mathrm{DR}}(\Omega)$ of degree-rank $(d, r)$. The basic properties of such symbols are set out in
Appendix E.

7. A MORE DETAILED OUTLINE OF THE ARGUMENT 

Now that we have set up the notation to describe nilcharacters and their symbols,
we are ready to give a high-level proof of Conjecture 5.5 (and hence Theorem 1.3),
contingent on some key sub-theorems which will be proven in later sections. This
corresponds to the realisation of points (i), (ii) and (ix) from the overview in §2.

As the cases $s = 1, 2$ of this conjecture are already known, we assume that $s \ge 3$.
We also assume inductively that the claim has already been proven for smaller
values of $s$. Henceforth $s$ is fixed.

Let $f \in L^\infty[N]$ be such that

$\|f\|_{U^{s+1}[N]} \gg 1.$ (7.1)

Define $f$ to be zero outside of $[N]$. Raising (7.1) to the power $2^{s+1}$, we see that

$\mathbb{E}_{h \in [[N]]} \|\Delta_h f\|^{2^s}_{U^s[N]} \gg 1$

and thus

$\|\Delta_h f\|_{U^s[N]} \gg 1$

for all $h$ in a dense subset $H$ of $[[N]]$. Applying the inductive hypothesis, we thus
see that $\Delta_h f$ is $(s-1)$-biased for all $h \in H$.
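The step from (7.1) to the averaged statement rests on the recursive structure of the Gowers norms. The sketch below checks the exact identity $\|f\|^{2^{d+1}}_{U^{d+1}} = \mathbb{E}_h \|\Delta_h f\|^{2^d}_{U^d}$ on a small cyclic group $\mathbb{Z}/N$, computed from definition (1.1). Working on $\mathbb{Z}/N$ rather than on the interval $[N]$ is a simplifying assumption (on $[N]$ the identity holds only up to normalisation constants), and the function names are illustrative.

```python
import cmath
from itertools import product


def mult_derivative(f, h):
    """Multiplicative derivative on Z/N: (Delta_h f)(x) = f(x + h) * conj(f(x))."""
    N = len(f)
    return [f[(x + h) % N] * f[x].conjugate() for x in range(N)]


def gowers_power_direct(f, d):
    """||f||_{U^d}^{2^d} as the average of Delta_{h_1} ... Delta_{h_d} f(x), as in (1.1)."""
    N = len(f)
    total = 0
    for hs in product(range(N), repeat=d):
        g = f
        for h in hs:
            g = mult_derivative(g, h)
        total += sum(g)
    return (total / N ** (d + 1)).real


def gowers_power_recursive(f, d):
    """The same quantity via ||f||_{U^d}^{2^d} = E_h ||Delta_h f||_{U^{d-1}}^{2^{d-1}}."""
    N = len(f)
    if d == 1:
        return abs(sum(f) / N) ** 2
    return sum(gowers_power_recursive(mult_derivative(f, h), d - 1) for h in range(N)) / N


if __name__ == "__main__":
    N = 5
    quadratic_phase = [cmath.exp(2j * cmath.pi * x * x / N) for x in range(N)]
    print(gowers_power_direct(quadratic_phase, 3), gowers_power_recursive(quadratic_phase, 3))
    print(gowers_power_direct(quadratic_phase, 2))
```

A quadratic phase has $U^3$ norm 1 but small $U^2$ norm, which is the simplest instance of the phenomenon that higher Gowers norms detect higher degree structure.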

By definition, we now know that $\Delta_h f$ correlates with a nilsequence of degree
$\le s-1$. By Corollary 6.5 we see that for each $h \in H$, $\Delta_h f$ correlates with a nilcharacter
$\chi_h \in \Xi^{s-1}([N])$. It is not hard to see that the space of such nilcharacters is a
$\sigma$-limit set (see Definition A.10), so by Lemma A.12 we can ensure that $\chi_h$ depends
in a limit fashion on $h$.

The aim at this point is to obtain, in several stages, information about the
dependence of $\chi_h$ on $h$. A key milestone in this analysis is a linearisation of $\chi_h$ on
$h$. In the case $s = 2$, treated in [21], the $\chi_h(n)$ were essentially just linear phases
$e(\xi_h n)$, and the outcome of the linearisation analysis was that the frequencies $\xi_h$
may be assumed to vary in a bracket-linear fashion with $h$. In the case $s = 3$
(treated in [28] but also dealt with in our present work), a model special case
occurs when $\chi_h(n) \approx e(\{\alpha_h n\}\beta_h n)$ (interpreting $\approx$ loosely). The outcome of the
linearisation analysis in that case was that at most one of $\alpha_h, \beta_h$ really depends on
$h$, and furthermore that this dependence on $h$ is bracket-linear in nature.

Now we formally set out the general case of this linearisation process. 

Theorem 7.1 (Linearisation). Let $f \in L^\infty[N]$, let $H$ be a dense subset of $[[N]]$,
and let $(\chi_h)_{h \in H}$ be a family of nilcharacters in $\Xi^{s-1}([N])$ depending in a limit
fashion on $h$, such that $\Delta_h f$ correlates with $\chi_h$ for all $h \in H$. Then there exists a
multidegree $(1, s-1)$-nilcharacter $\chi \in \Xi^{(1,s-1)}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$ such that $\Delta_h f$ $(s-2)$-correlates
with $\chi(h, \cdot)$ for many $h \in H$.

This statement represents the outcome of points (iii) to (vii) of the outline in §2
and must therefore address the following points:

• For some suitable notion of "frequency", the symbol of $\chi_h(n)$ contains only
one frequency that genuinely depends on $h$;

• That frequency depends on $h$ in a bracket-linear manner;

• Once this is known, it follows that, for many $h$, $\Delta_h f$ $(s-2)$-correlates with
$\chi(h, n)$, where $\chi$ is a certain 2-variable nilsequence.

These three tasks are, in fact, established together and in an incremental fashion.
The nilcharacter $\chi_h(n)$ is gradually replaced by objects of the form $\chi'(h, n) \otimes \chi'_h(n)$
where $\chi'(h, n)$ is a 2-dimensional nilcharacter of multidegree $(1, s-1)$ and, at each
stage, the nilcharacter $\chi'_h(n)$ (which has so far not been shown to vary in any
nice way with $h$) is "simpler" than $\chi_h(n)$. The notion of simpler in this context
is measured by the degree-rank filtration, a concept that was introduced in the
previous section. Thus the result of a single pass over the three points listed above
is the following subclaim.

Theorem 7.2 (Linearisation, inductive step). Let $1 \le r_* \le s-1$, let $f \in L^\infty[N]$,
let $H$ be a dense subset of $[[N]]$, let $\chi \in \Xi^{(1,s-1)}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$, let $(\chi_h)_{h \in H}$ be a family of
nilcharacters of degree-rank $(s-1, r_*)$ depending in a limit fashion on $h$, such that
$\Delta_h f$ $(s-2)$-correlates with $\chi(h, \cdot) \otimes \chi_h$ for all $h \in H$. Then there exists a dense
subset $H'$ of $H$, a multidegree $(1, s-1)$-nilcharacter $\chi' \in \Xi^{(1,s-1)}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$ and a family
$(\chi'_h)_{h \in H'}$ of nilcharacters of degree-rank $(s-1, r_*-1)$ depending in a limit fashion
on $h$, such that $\Delta_h f$ $(s-2)$-correlates with $\chi'(h, \cdot) \otimes \chi'_h$ for all $h \in H'$.

Theorem 7.1 follows easily by inductive use of this statement, starting with $r_*$
equal to $s-1$ and using Theorem 7.2 iteratively to decrease $r_*$ all the way to zero.
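
The bookkeeping of this iteration can be sketched schematically as follows. The superscripted names $\chi^{(j)}$, $\chi^{(j)}_h$ and $H_j$ are labels introduced here for illustration only, and the initialisation (taking the two-variable nilcharacter to be trivial and regarding $\chi_h$ as having degree-rank $(s-1, s-1)$) is an assumption about how the induction is started, not a statement taken from the text.

```latex
% Schematic only: j counts the applications of Theorem 7.2.
\begin{align*}
\text{start } (j = 0):\quad
  & \chi^{(0)} \equiv 1,\quad \chi^{(0)}_h = \chi_h \text{ of degree-rank } (s-1,\, s-1),\quad H_0 = H; \\
\text{step } (j \to j+1):\quad
  & \text{apply Theorem 7.2 with } r_* = s-1-j \text{ to obtain a dense } H_{j+1} \subseteq H_j, \\
  & \text{a nilcharacter } \chi^{(j+1)}, \text{ and } \chi^{(j+1)}_h \text{ of degree-rank } (s-1,\, s-2-j); \\
\text{end } (j = s-1):\quad
  & \Delta_h f \ (s-2)\text{-correlates with } \chi^{(s-1)}(h,\cdot) \otimes \chi^{(s-1)}_h
    \text{ for } h \in H_{s-1}, \\
  & \text{with } \chi^{(s-1)}_h \text{ of degree-rank } (s-1,\, 0).
\end{align*}
```

After $s-1$ passes the rank parameter has reached zero. At that point one expects the remaining factor $\chi^{(s-1)}_h$ to be of lower order, so that it can be absorbed into the $(s-2)$-correlation, leaving $\chi := \chi^{(s-1)}$ as the nilcharacter in Theorem 7.1.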

To prove Theorem 7.2 we follow steps (iii) to (vii) in the outline quite closely.
The first step, which is the realisation of (iii), is a Gowers-style Cauchy-Schwarz
inequality to eliminate the function $f$ as well as the 2-dimensional nilcharacter
$\chi(h, n)$ and therefore obtain a statement concerning only the (so far) unstructured-in-$h$
object $\chi_h(n)$. Here is a precise statement of the outcome of this procedure;
the proof of this proposition is the main business of §8.

Proposition 7.3 (Gowers Cauchy-Schwarz argument). Let $f, H, \chi, (\chi_h)_{h \in H}$ be as
in Theorem 7.2. Then the sequence

$$n \mapsto \chi_{h_1}(n) \otimes \chi_{h_2}(n + h_1 - h_4) \otimes \overline{\chi_{h_3}(n)} \otimes \overline{\chi_{h_4}(n + h_1 - h_4)} \tag{7.2}$$

is $(s-2)$-biased for many additive quadruples $(h_1, h_2, h_3, h_4)$ in $H$.

With this in hand, we reach the most complicated part of the argument. This
is the use of Proposition 7.3 to study the "frequencies" of the nilcharacters $\chi_h$ and
the way they depend on $h$. Roughly speaking, the aim is to interpret the tensor
product (7.2) as a nilsequence itself (depending on $h_1, h_2, h_3, h_4$) and use results
from [23] to analyse its equidistribution and bias properties.

To make proper sense of this one must first find a suitable "representation"
of the $\chi_h(n)$ in which the frequencies are either independent of $h$, depend in a
bracket-linear fashion on $h$, or are appropriately dissociated in $h$, in the sense that
the frequencies associated to (7.2) are "linearly independent" for most additive
quadruples $h_1 + h_2 = h_3 + h_4$. This task is one of the more technical parts of the
paper and is performed in §10; it incorporates the additive combinatorial step
(vi) of the outline from §2. The precise statement of what we prove is Lemma 10.10,
the "sunflower decomposition".

The representation of the $\chi_h$ (and hence of (7.2)) involves constructing a suitable
polynomial orbit on something resembling a free nilpotent Lie group $G$; this
device also featured in [55, §5]. Once this is done, one applies the results from [55]
to examine the orbit of this polynomial sequence on the corresponding nilmanifold
$G/\Gamma$. The results of [21] assert (roughly speaking) that this orbit is close to the
uniform measure on a subnilmanifold $H\Gamma/\Gamma$, where $H \le G$ is some closed subgroup.
In §11 we then crucially apply a commutator argument of Furstenberg and
Weiss that exploits some equidistribution information on projections of $H$ to say
something about this group $H$. The upshot of this critical phase of the argument is
that the $h$-dependence of the frequencies of $\chi_h$ cannot be dissociated in nature, and
must instead be completely bracket-linear; the precise statement here is Theorem

En] 

At this point in the argument, we have basically shown that the top-order behaviour
(in the degree-rank order) of the nilcharacters $\chi_h(n)$ is bracket-linear in
$h$. To complete the proof of Theorem 7.2 (and hence of Theorem 7.1) it remains
to carry out part (vii) of the outline, that is to say to interpret this bracket-linear
part of $\chi_h(n)$ as a multidegree $(1, s-1)$ nilcharacter $\chi'(h, n)$. This is the first part
of the argument where some sort of "degree $s$ nil-object" is actually constructed,
and is thus a key milestone in the inductive derivation of GI($s$) from GI($s-1$).
As remarked previously, our construction here is a little more conceptual (and
abstractly algebraic) than in previous works, which have been somewhat ad hoc. The
construction is given in §12. At the end of that section we wrap up the proof of
Theorem 7.1; by this point, all the hard work has been done.

With Theorem 7.1 in hand, we have completed the first seven steps of the outline.
The only remaining substantial step is step (viii), the symmetry argument. Here is
a formal statement of it:

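Theorem 7.4 (Symmetrisation). Let $f \in L^\infty[N]$, let $H$ be a dense subset of $[[N]]$,
and let $\chi \in \Xi^{(1,s-1)}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$ be such that $\Delta_h f$ $\le (s-2)$-correlates with $\chi(h, \cdot)$ for all
$h \in H$. Then there exists a nilcharacter $\Theta \in \Xi^{s}({}^*\mathbb{Z})$ (with the degree filtration) and
a nilsequence $\Psi \in \mathrm{Nil}^{J}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$, with $J \subset \mathbb{N}^2$ given by the downset

$$J := \{(i,j) \in \mathbb{N}^2 : i + j \le s-1\} \cup \{(i, s-i) : 2 \le i \le s\}, \tag{7.3}$$

such that $\chi(h, n)$ is a bounded linear combination of $\Theta(n+h) \otimes \overline{\Theta(n)} \otimes \Psi(h, n)$.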

The proof is given in §13. Informally, this theorem asserts that the multidimensional
degree $(1, s-1)$ nilcharacter $\chi(h, n)$ can be expressed as a derivative
$\Theta(n+h) \otimes \overline{\Theta(n)}$ of a degree $s$ nilcharacter $\Theta$, modulo "lower order terms", which
in this context means multidimensional nilsequences $\Psi(h, n)$ that either have total
degree $\le s-1$, or are of degree at most $s-2$ in the $n$ variable.

The remaining task for this section is to show how to complete the proof of
Conjecture 5.3 (and Theorem 1.3) from this point. From the discussion at the
beginning of this section, we have already arrived at a situation in which the given
function $f \in L^\infty[N]$ has the property that $\Delta_h f$ correlates with $\chi_h$ for all $h$ in a
dense subset $H$ of $[[N]]$, where $(\chi_h)_{h \in H}$ is a family of nilcharacters in $\Xi^{s-1}([N])$
depending in a limit fashion on $h$.

From Theorem 7.1 and Theorem 7.4 we see that for many $h \in [[N]]$, $\Delta_h f$ $\le (s-2)$-correlates
with the sequence

$$n \mapsto \Theta(n+h) \otimes \overline{\Theta(n)} \otimes \Psi(h, n).$$

The next step is to break up $J$ and $\Psi$ into simpler components, and our tool for this
purpose shall be Lemma E.4. Applying this lemma for $\varepsilon$ sufficiently small, followed
by the pigeonhole principle, one can thus find scalar-valued nilsequences $\psi, \psi'$ on
${}^*\mathbb{Z}^2$ (with the multidegree filtration) of multidegree

$$\subset \{(0,j) \in \mathbb{N}^2 : j \le s-1\}$$

and

$$\subset \{(i,j) \in \mathbb{N}^2 : j \le s-2;\ i+j \le s\}$$

respectively (the two pieces of the downset $J$ in (7.3), with the first coordinate recording
the degree in $h$ and the second the degree in $n$), such that for many $h \in [[N]]$, $\Delta_h f$
$\le (s-2)$-correlates with

$$n \mapsto \Theta(n+h) \otimes \overline{\Theta(n)}\, \psi(h, n)\, \psi'(h, n).$$

For fixed $h$, the nilsequence $\psi'(h, n)$ has degree $\le s-2$ in $n$ and can thus be ignored.
Also, $\psi(h, n) = \psi(n)$ is of multidegree $\le (0, s-1)$ and is thus independent of $h$,
with $n \mapsto \psi(n)$ being a degree $\le s-1$ nilsequence. Thus, for many $h \in [[N]]$, $\Delta_h f$
$\le (s-2)$-correlates with

$$n \mapsto \Theta(n+h) \otimes \overline{\Theta(n)}\, \psi(n).$$
Applying the pigeonhole principle again, we can thus find scalar nilsequences $\theta, \theta' \in
\mathrm{Nil}^{\le s}({}^*\mathbb{Z})$ such that for many $h \in [[N]]$, $\Delta_h f$ $\le (s-2)$-correlates with

$$n \mapsto \theta(n+h)\, \theta'(n)$$

(indeed one takes $\theta, \theta'$ to be coefficients of $\Theta$ and $\overline{\Theta}\psi$ respectively). Applying the
converse to GI($s$) (Proposition 5.6), we conclude

$$\|\Delta_h f\, \overline{\theta}(\cdot + h)\, \overline{\theta'}(\cdot)\|_{U^{s-1}[N]} \gg 1$$

for many $h \in H$. Averaging over $h$ (using Corollary A.6 to obtain the required
uniformity), we conclude that

$$\mathbb{E}_{h \in [[N]]} \|\Delta_h f\, \overline{\theta}(\cdot + h)\, \overline{\theta'}(\cdot)\|_{U^{s-1}[N]} \gg 1.$$

Applying the Cauchy-Schwarz-Gowers inequality (see e.g. [STJ Equation (11.6)]) we
conclude that

$$\|f\, \overline{\theta}\|_{U^{s}[N]} \gg 1,$$

and hence by the inductive hypothesis (Conjecture 5.5 for $s-1$), $f\overline{\theta}$ is $\le (s-1)$-biased.
Since $\theta$ is a degree $\le s$ nilsequence, we conclude that $f$ is $\le s$-biased, as
required. This concludes the proof of Conjecture 5.5, Conjecture 5.3 and hence
Theorem 1.3, contingent on Theorem 7.1 and Theorem 7.4.

8. A VARIANT OF GOWERS'S CAUCHY-SCHWARZ ARGUMENT 

The aim of this section is to prove Proposition 7.3. Thus, we have standard integers
$1 \le r_* \le s-1$, a function $f \in L^\infty[N]$, a dense subset $H$ of $[[N]]$, a two-dimensional
nilcharacter $\chi \in \Xi^{(1,s-1)}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$ of multidegree $(1, s-1)$, and a family $(\chi_h)_{h \in H}$ of
nilcharacters of degree-rank $(s-1, r_*)$ depending in a limit fashion on $h$. We are
given that $\Delta_h f$ $(s-2)$-correlates with $\chi(h, \cdot) \otimes \chi_h$ for all $h \in H$. Our objective is
to show that, for many additive quadruples $(h_1, h_2, h_3, h_4)$ in $H$, the expression

$$n \mapsto \chi_{h_1}(n) \otimes \chi_{h_2}(n + h_1 - h_4) \otimes \overline{\chi_{h_3}(n)} \otimes \overline{\chi_{h_4}(n + h_1 - h_4)} \tag{8.1}$$

(where we extend the $\chi_h$ by zero outside of $[N]$) is $(s-2)$-biased.

The strategy, following the work of Gowers [16], is to start with the $\le (s-2)$-correlation
between $\Delta_h f$ and $\chi(h, \cdot) \otimes \chi_h$ and then apply the Cauchy-Schwarz
inequality repeatedly to eliminate all terms involving $f$ and $\chi(h, \cdot)$, finally arriving at a
correlation statement that only involves $\chi_h$ (and lower order terms).

Unfortunately, there is a technical issue that prevents one from doing this directly,
namely that the behaviour of $\chi(h, \cdot)$ in $h$ is not quite linear enough to ensure
that these terms are completely eliminated by a Cauchy-Schwarz procedure. In
order to overcome this issue, one must first put $\chi$ into a better form, as follows.
We need the following technical notion (which will not be used outside of this
section):

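Definition 8.1. A linearised $(1, s-1)$-function is a limit function $\chi : (h, n) \mapsto \chi(h, n)$,
taking values in $\mathbb{C}^\omega$, which has a factorisation

$$\chi(h, n) = c(n)^h\, \psi(n) \tag{8.2}$$

where $\psi \in L^\infty(\mathbb{Z} \to \mathbb{C}^\omega)$ and $c \in L^\infty(\mathbb{Z} \to S^1)$ are such that, for every $h, l \in \mathbb{Z}$, the
sequence

$$n \mapsto c(n-l)^h\, \overline{c(n)}^{\,h}$$

is a degree $\le s-2$ nilsequence.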
Remark 8.2. Heuristically, one should think of a linearised $(1, s-1)$-function as
(a vector-valued smoothing of) a function of the form

$$(h, n) \mapsto e(P(n) + hQ(n))$$

where $P, Q$ are bracket polynomials of degree $s-1$; for instance,

$$(h, n) \mapsto e(\{\alpha n\}\beta n + \{\gamma n\}\delta n h)$$

is morally a linearised $(1,2)$ function. This should be compared with more general
multidegree $(1,2)$ nilcharacters, such as

$$(h, n) \mapsto e(\{\{\alpha h\}\beta n\}\gamma n)$$

which are not quite linear in $h$ because the dependence on $h$ is buried inside one or
more fractional part operations. Intuitively, the point is that one can use the laws
of bracket algebra (such as (6.4)) to move the $h$ outside of all the fractional part
expressions (modulo lower order terms). While one can indeed develop enough of
the machinery of bracket calculus to realise this intuition concretely, we will instead
proceed by the more abstract machinery of nilmanifolds in order to avoid having
to set up the bracket calculus.
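
As a sanity check on Definition 8.1 in the simplest model case, suppose (an assumption made only for this illustration) that $P$ and $Q$ are genuine polynomials of degree at most $s-1$ rather than bracket polynomials, and take $c(n) := e(Q(n))$ and $\psi(n) := e(P(n))$. The factorisation (8.2) and the nilsequence condition can then be verified directly:

```latex
\begin{align*}
  \chi(h,n) &= e\bigl(P(n) + hQ(n)\bigr) = e(Q(n))^{h}\, e(P(n)) = c(n)^{h}\,\psi(n), \\
  c(n-l)^{h}\,\overline{c(n)}^{\,h} &= e\bigl(h\,(Q(n-l) - Q(n))\bigr).
\end{align*}
% Q(n-l) - Q(n) is a polynomial in n of degree at most s-2, because the
% leading terms of Q(n-l) and Q(n) cancel; so for fixed h and l the phase
% above is a polynomial phase of degree at most s-2.
```

In the bracket-polynomial setting the same cancellation is only expected to hold modulo lower order terms, which is the difficulty the remark describes.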

The key preparation for this is the following. 

Proposition 8.3. Let $\chi \in \Xi^{(1,s-1)}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$ be a two-dimensional nilcharacter of
multidegree $(1, s-1)$, and let $\varepsilon > 0$ be standard. Then one can approximate $\chi$ to within
$\varepsilon$ in the uniform norm by a bounded linear combination of linearised $(1, s-1)$-functions.

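Proof. From Definition 6.1 we can express

$$\chi(h, n) = F(g(h, n)\,{}^*\Gamma)$$

where $G/\Gamma$ is an $\mathbb{N}^2$-filtered nilmanifold of multidegree $\le (1, s-1)$, $g \in {}^*\mathrm{poly}(\mathbb{Z}^2_{\mathbb{N}^2} \to
G_{\mathbb{N}^2})$ (with $\mathbb{Z}^2$ being given the multidegree filtration $\mathbb{Z}^2_{\mathbb{N}^2}$), and $F \in \mathrm{Lip}({}^*(G/\Gamma) \to
S^\omega)$ has a vertical frequency $\eta : G_{(1,s-1)} \to \mathbb{R}$.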

We consider the quotient map $\pi : G/\Gamma \to G/(G_{(1,0)}\Gamma)$ from $G/\Gamma$ onto the
nilmanifold $G/(G_{(1,0)}\Gamma)$, which can be viewed as an $\mathbb{N}$-filtered nilmanifold of degree
$\le s-1$ (where we $\mathbb{N}$-filter $G/G_{(1,0)}$ using the subgroups $G_{(0,i)}G_{(1,0)}/G_{(1,0)}$). The
fibers of this map are isomorphic to $T := G_{(1,0)}/\Gamma_{(1,0)}$. Observe that $G_{(1,0)}$ is
abelian, and so $T$ is a torus; thus $G/\Gamma$ is a torus bundle over $G/(G_{(1,0)}\Gamma)$ with
structure group $T$. The idea is to perform Fourier analysis on this large torus $T$,
as opposed to the smaller torus $G_{(1,s-1)}/\Gamma_{(1,s-1)}$, to improve the behaviour of the
nilcharacter $\chi$.

We pick a metric on the base nilmanifold $G/(G_{(1,0)}\Gamma)$ and a small standard
radius $r > 0$, and form a smooth partition of unity $1 = \sum_{k=1}^{K} \varphi_k$ on $G/(G_{(1,0)}\Gamma)$,
where each $\varphi_k \in \mathrm{Lip}(G/(G_{(1,0)}\Gamma) \to \mathbb{C})$ is supported on an open ball $B_k$ of radius
$r$. This induces a partition $\chi = \sum_{k=1}^{K} \chi_k$, where

$$\chi_k(h, n) = F(g(h, n)\,{}^*\Gamma)\, \varphi_k(\pi(g(h, n)\,{}^*\Gamma)).$$

Now fix one of the $k$. Then we have

$$\chi_k(h, n) = F_k(g(h, n)\,{}^*\Gamma)$$

where $F_k$ is compactly supported in the cylinder $\pi^{-1}(B_k)$.
If $r$ is small enough, we have a smooth section $\iota : B_k \to G$ that partially inverts
the projection from $G$ to $G/(G_{(1,0)}\Gamma)$, and so we can parameterise any element $x$
of $\pi^{-1}(B_k)$ uniquely as $\iota(x_0)t\Gamma$ for some $x_0 \in B_k$ and $t \in T$ (noting that $t\Gamma$ is
well-defined as an element of $G/\Gamma$). Similarly, we can parameterise any element of
${}^*\pi^{-1}(B_k)$ uniquely as $\iota(x_0)t\Gamma$ for $x_0 \in {}^*B_k$ and $t \in {}^*T$.

We can now view the Lipschitz function $F_k \in \mathrm{Lip}({}^*(G/\Gamma))$ as a compactly
supported Lipschitz function in $\mathrm{Lip}({}^*(B_k \times T))$. Applying a Fourier (or Stone-Weierstrass)
decomposition in the $T$ directions (cf. Lemma E.5), we thus see that
for any standard $\varepsilon > 0$ we can approximate $F_k$ uniformly to error $\varepsilon/K$ by a sum
$\sum_{k'=1}^{K'} F_{k,k'}$, where $K'$ is standard and each $F_{k,k'} \in \mathrm{Lip}({}^*(B_k \times T))$ is compactly
supported and has a character $\xi : T \to \mathbb{T}$ such that

$$F_{k,k'}(\iota(x_0)t\Gamma) = e(\xi(t))\, F_{k,k'}(\iota(x_0)\Gamma) \tag{8.3}$$

for all $x_0 \in {}^*(2B_k)$ and $t \in {}^*T$. It thus suffices to show that for each $k, k'$, the
sequence

$$\chi_{k,k'} : (h, n) \mapsto F_{k,k'}(g(h, n)\,{}^*\Gamma)$$

is a linearised $(1, s-1)$-function.

Fix $k, k'$. Performing a Taylor expansion (Lemma B.9) of the polynomial sequence
$g \in {}^*\mathrm{poly}(\mathbb{Z}^2_{\mathbb{N}^2} \to G_{\mathbb{N}^2})$, we may write

$$g(h, n) = g_0(n)\, g_1(n)^h$$

where $g_0 \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ is a one-dimensional polynomial map (giving $G$
the $\mathbb{N}$-filtration $G_{\mathbb{N}} := (G_{(0,i)})_{i \in \mathbb{N}}$) and $g_1 \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G_{(1,0)})_{\mathbb{N}})$ is another
one-dimensional polynomial map (giving the abelian group $G_{(1,0)}$ the $\mathbb{N}$-filtration
$(G_{(1,0)})_{\mathbb{N}} := (G_{(1,i)})_{i \in \mathbb{N}}$). In particular, we see that $\chi_{k,k'}(h, n)$ is only non-vanishing
when $\pi(g_0(n)\,{}^*\Gamma) \in B_k$. Furthermore, in that case we see from (8.3) that

$$\chi_{k,k'}(h, n) = e(h\,\xi(g_1(n) \bmod \Gamma_{(1,0)}))\, F_{k,k'}(g_0(n)\,{}^*\Gamma), \tag{8.4}$$

which gives the required factorisation (8.2) with $c(n) := e(\xi(g_1(n) \bmod \Gamma_{(1,0)}))$ and
$\psi(n) := F_{k,k'}(g_0(n)\,{}^*\Gamma)$.

The only remaining task is to establish that for any given $h, l$, the sequence
$n \mapsto c(n-l)^h\, \overline{c(n)}^{\,h}$ is a degree $\le s-2$ nilsequence. We expand this sequence as

$$n \mapsto e\bigl(h\,(\xi(g_1(n-l) \bmod \Gamma_{(1,0)}) - \xi(g_1(n) \bmod \Gamma_{(1,0)}))\bigr).$$

But from the abelian nature of $G_{(1,0)}$, the map $n \mapsto \xi(g_1(n) \bmod \Gamma_{(1,0)})$ is a
polynomial map from ${}^*\mathbb{Z}$ to ${}^*\mathbb{T}$ of degree at most $s-1$, and the claim follows. □

We now return to the proof of Proposition 7.3. With this multiplicative structure,
we can now begin the Cauchy-Schwarz argument. By hypothesis, for each $h \in H$
we can find a scalar nilsequence $\psi_h$ of degree $\le s-2$ such that

$$|\mathbb{E}_{n \in [N]} \Delta_h f(n)\, \overline{\chi(h, n)} \otimes \overline{\chi_h(n)}\, \psi_h(n)| \gg 1.$$

By Corollary A.12 we may ensure that $\psi_h$ varies in a limit fashion on $h$. Applying
Corollary A.6, this lower bound is uniform in $h$.

Applying Proposition 8.3 (with a sufficiently small $\varepsilon$) and using the pigeonhole
principle, we may then find a linearised $(1, s-1)$-function $(h, n) \mapsto c(n)^h \psi(n)$ such
that

$$|\mathbb{E}_{n \in [N]} \Delta_h f(n)\, c(n)^{-h}\, \overline{\psi(n)} \otimes \overline{\chi_h(n)}\, \psi_h(n)| \gg 1.$$






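By Corollary A.6 again, the lower bound is still uniform in $h$. We may then average
in $h$ (extending $\psi_h$, $\chi_h$ by zero for $h$ outside of $H$) and conclude that

$$\mathbb{E}_{h \in [[N]]} |\mathbb{E}_{n \in [N]} \Delta_h f(n)\, c(n)^{-h}\, \overline{\psi(n)} \otimes \overline{\chi_h(n)}\, \psi_h(n)| \gg 1,$$

thus there exists a scalar function $b \in L^\infty[[N]]$ such that

$$|\mathbb{E}_{h \in [[N]]} \mathbb{E}_{n \in [N]}\, b(h) f(n+h) \overline{f(n)}\, c(n)^{-h}\, \overline{\psi(n)} \otimes \overline{\chi_h(n)}\, \psi_h(n)| \gg 1.$$

By absorbing $b(h)$ into the $\psi_h$ factor, we may now drop the $b(h)$ factor. We write
$n + h = m$ and obtain

$$|\mathbb{E}_{m \in [N]} f(m)\, \mathbb{E}_{h \in [[N]]}\, c(m-h)^{-h} f'(m-h) \otimes \overline{\chi_h(m-h)}\, \psi_h(m-h)| \gg 1$$

where $f' := \overline{f}\,\overline{\psi}$ (recall that $f$ is extended by zero outside of $[N]$), which by
Cauchy-Schwarz implies that

$$|\mathbb{E}_{m \in [N]} \mathbb{E}_{h, h' \in [[N]]}\, c(m-h)^{-h} c(m-h')^{h'} f'(m-h) \otimes \overline{f'(m-h')}$$
$$\otimes\, \overline{\chi_h(m-h)} \otimes \chi_{h'}(m-h')\, \psi_h(m-h)\, \overline{\psi_{h'}(m-h')}| \gg 1.$$

Making the change of variables $h' = h + l$, $n = m - h$, we obtain

$$|\mathbb{E}_{h, l \in [[2N]];\, n \in [N]}\, c(n)^{-h} c(n-l)^{h+l} f'(n) \otimes \overline{f'(n-l)}$$
$$\otimes\, \overline{\chi_h(n)} \otimes \chi_{h+l}(n-l)\, \psi_h(n)\, \overline{\psi_{h+l}(n-l)}| \gg 1.$$

We then simplify this as

$$|\mathbb{E}_{h, l \in [[2N]];\, n \in [N]}\, c_2(l, n) \otimes \overline{\chi_h(n)} \otimes \chi_{h+l}(n-l)\, \psi_{h,l}(n)| \gg 1 \tag{8.5}$$

where

$$c_2(l, n) := c(n-l)^{l} f'(n) \otimes \overline{f'(n-l)},$$

$$\psi_{h,l}(n) := c(n-l)^{h} c(n)^{-h}\, \psi_h(n)\, \overline{\psi_{h+l}(n-l)}.$$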

Clearly $c_2$ is bounded. As for $\psi_{h,l}$, we see from Definition 8.1 and Corollary E.2
that $\psi_{h,l}$ is a nilsequence of degree $\le s-2$ for each $h, l$.
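
The simplification leading to (8.5) rests on regrouping the powers of $c$. Written out (using that $c$ takes values in $S^1$, so that $\overline{c(n)} = c(n)^{-1}$), the regrouping is:

```latex
\begin{align*}
  c(n)^{-h}\, c(n-l)^{h+l}
    &= c(n-l)^{l} \cdot \bigl( c(n-l)^{h}\, c(n)^{-h} \bigr).
\end{align*}
% The first factor depends only on (l, n) and is absorbed into c_2(l, n);
% the second factor is the sequence appearing in Definition 8.1 and is
% absorbed into \psi_{h,l}(n).
```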

Returning to (8.5), we use the pigeonhole principle to conclude that for many
$k \in [[2N]]$, we have

$$|\mathbb{E}_{h \in [[2N]];\, n \in [N]}\, c_2(k, n) \otimes \overline{\chi_h(n)} \otimes \chi_{h+k}(n-k)\, \psi_{h,k}(n)| \gg 1.$$

Let $k$ be such that the above estimate holds. Applying Cauchy-Schwarz in the $n$
variable to eliminate the $c_2(k, n)$ term, we have

$$|\mathbb{E}_{h, h' \in [[2N]];\, n \in [N]}\, \overline{\chi_h(n)} \otimes \chi_{h+k}(n-k) \otimes \chi_{h'}(n) \otimes \overline{\chi_{h'+k}(n-k)}\, \psi_{h,k}(n)\, \overline{\psi_{h',k}(n)}| \gg 1$$

and thus for many $k, h, h' \in [[2N]]$, we have

$$|\mathbb{E}_{n \in [N]}\, \overline{\chi_h(n)} \otimes \chi_{h+k}(n-k) \otimes \chi_{h'}(n) \otimes \overline{\chi_{h'+k}(n-k)}\, \psi_{h,k}(n)\, \overline{\psi_{h',k}(n)}| \gg 1,$$

which implies that

$$n \mapsto \overline{\chi_h(n)} \otimes \chi_{h+k}(n-k) \otimes \chi_{h'}(n) \otimes \overline{\chi_{h'+k}(n-k)}$$

is $(s-2)$-biased on $[N]$. Note that this forces $h, h+k, h', h'+k$ to be an additive
quadruple in $H$, as otherwise the expression vanishes. Applying a change of
variables, we obtain Proposition 7.3.
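
The final change of variables can be made explicit. One relabelling consistent with the shifts appearing in (8.1) is the following; the roles of $h$ and $h'$ may be interchanged, and this particular assignment is chosen here for illustration rather than taken from the text.

```latex
\begin{align*}
  h_1 &:= h', & h_2 &:= h + k, & h_3 &:= h, & h_4 &:= h' + k, \\
  h_1 + h_2 &= h + h' + k = h_3 + h_4, & h_1 - h_4 &= -k .
\end{align*}
% Thus (h_1, h_2, h_3, h_4) is an additive quadruple, and the two factors
% evaluated at n - k are exactly the ones evaluated at n + h_1 - h_4.
```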

For future reference we observe that a simpler version of the same argument (in
which the $\chi$ and $\psi_h$ factors are not present) gives




Proposition 8.4 (Cauchy-Schwarz). Let $f \in L^\infty[N]$, let $H$ be a dense subset of
$[[N]]$, and suppose that one has a family of functions $\chi_h \in L^\infty({}^*\mathbb{Z})$ depending in a
limit fashion on $h$, such that $\Delta_h f$ correlates with $\chi_h$ on $[N]$ for all $h \in H$. Then
for many (i.e. for $\gg N^3$) additive quadruples $(h_1, h_2, h_3, h_4)$ in $H$, the sequence

$$n \mapsto \chi_{h_1}(n) \otimes \chi_{h_2}(n + h_1 - h_4) \otimes \overline{\chi_{h_3}(n)} \otimes \overline{\chi_{h_4}(n + h_1 - h_4)} \tag{8.6}$$

is biased.

This proposition in fact has quite a simple proof; see [29]. Note how we can
conclude (8.6) to be biased and not merely $(s-2)$-biased. As such, Proposition 8.4
saves some "lower order" information that was not present in Proposition 7.3; this
lower order information will be crucial later in the argument, when we establish the
symmetry property in Theorem 7.4.

9. Frequencies and representations 

We will use Proposition 7.3 to analyse the "frequency" of the nilcharacters
$(\chi_h)_{h \in H}$ appearing in Theorem 7.2. To motivate the discussion, let us first suppose
that we are in the (significantly simpler) $s = 2$ case, rather than the actual
case $s \ge 3$ of interest. When $s = 2$, we can represent $\chi_h$ as a linear phase
$\chi_h(n) = e(\xi_h n + \theta_h)$ for some $\xi_h, \theta_h \in {}^*\mathbb{T}$; one can then interpret $\xi_h$ as the frequency
of $\chi_h$.

In order to describe how this frequency $\xi_h$ behaves in $h$, it will be convenient to
represent $\xi_h$ as a linear combination

$$\xi_h = a_{1,h}\,\xi_{1,h} + \cdots + a_{D,h}\,\xi_{D,h} \tag{9.1}$$

of other frequencies $\xi_{1,h}, \ldots, \xi_{D,h} \in {}^*\mathbb{T}$, where the $a_{i,h} \in \mathbb{Z}$ are (standard) integer
coefficients, and the $(\xi_{i,h})_{h \in H}$ are families of frequencies which have better properties
with regards to their dependence on $h$; for instance, they might be "core
frequencies" that are independent of $h$, or they might be "bracket-linear
petal" frequencies that depend in a bracket-linear fashion on $h$, or they might be
"regular petal" frequencies which behave in a suitably "dissociated" manner in $h$.
We can schematically depict the relationship (9.1) as

$$[\chi_h] \approx \eta_h(\mathcal{F}_h)$$

where $[\chi_h]$ is some sort of "symbol" of $\chi_h$ (which, in the linear case $s = 2$, is just
$\xi_h \bmod 1$), $\mathcal{F}_h \in {}^*\mathbb{T}^D$ is the frequency vector $\mathcal{F}_h = (\xi_{1,h}, \ldots, \xi_{D,h})$, and $\eta_h : {}^*\mathbb{T}^D \to
{}^*\mathbb{T}$ is the vertical frequency

$$\eta_h(x_1, \ldots, x_D) := a_{1,h}\,x_1 + \cdots + a_{D,h}\,x_D. \tag{9.2}$$
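
For illustration only, here is a hypothetical instance of (9.1) and (9.2) with $D = 2$. The integer coefficients $3$ and $2$ and the frequencies $\xi_1$, $\alpha$, $\beta$ are invented for the example and do not come from the text; the point is merely to show how a core frequency and a bracket-linear frequency would be packaged into a frequency vector and a vertical frequency.

```latex
\begin{align*}
  \xi_h &= 3\,\xi_{1,h} + 2\,\xi_{2,h},
    & \xi_{1,h} &= \xi_1 \ (\text{core: independent of } h),
    & \xi_{2,h} &= \{\alpha h\}\beta \bmod 1 \ (\text{bracket-linear in } h), \\
  \mathcal{F}_h &= \bigl(\xi_1,\ \{\alpha h\}\beta\bigr) \in {}^*\mathbb{T}^2,
    & \eta_h(x_1, x_2) &= 3x_1 + 2x_2,
    & \eta_h(\mathcal{F}_h) &= \xi_h .
\end{align*}
```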

We will need to find analogues of the above type of representation in higher
degree $s \ge 3$. We will wish to represent the symbol $[\chi]_{\Xi^{(s-1,r_*)}_{\mathrm{DR}}([N])}$ of
a nilcharacter $\chi$ on $[N]$ of degree-rank $(s-1, r_*)$ (which will ultimately depend on
a parameter $h$, though we will not need this parameter in the current discussion)
heuristically as

$$[\chi]_{\Xi^{(s-1,r_*)}_{\mathrm{DR}}([N])} \approx \eta(\mathcal{F}) \tag{9.3}$$

where $\mathcal{F} = (\xi_{i,j})_{1 \le i \le s-1;\, 1 \le j \le D_i}$ is a horizontal frequency vector of frequencies
$\xi_{i,j} \in {}^*\mathbb{T}$ associated to a dimension vector $D = (D_1, \ldots, D_{s-1})$, and $\eta$ is a vertical
frequency that generalises (9.2), but whose precise form we are not yet ready
to describe. We then say that the triple $(D, \eta, \mathcal{F})$ forms a total frequency
representation of $\chi$.

In the previous paper [55] that treated the $s = 3$ case, such a representation
was implicitly used via the description of degree-rank $(2, 2)$ nilcharacters $\chi_h$ as
essentially being bracket quadratic phases $e(\sum_{j=1}^{J} \{\alpha_{h,j} n\}\beta_{h,j} n)$ modulo lower order
terms (and ignoring the issue of vector-valued smoothing for now). In our current
language, this would correspond to a dimension vector $D = (2J, 0)$ and a horizontal
frequency vector of the form $(\alpha_{h,1}, \ldots, \alpha_{h,J}, \beta_{h,1}, \ldots, \beta_{h,J})$, and a certain vertical
frequency $\eta$ depending only on $J$ that we are not yet ready to describe explicitly
here. Bracket-calculus identities such as (6.4) could then be used to manipulate
such a universal frequency representation into a suitably "regularised" form.

In principle, one could also use bracket calculus to extract the symbol of $\chi_h$ in
terms of frequencies such as $\alpha_{h,j}$ and $\beta_{h,j}$ for higher values of $s$. However, as we are
avoiding the use of bracket calculus machinery here, we will proceed instead using
the language of nilmanifolds, and in particular by lifting the nilmanifold $G_h/\Gamma_h$ up
to a universal nilmanifold in order to obtain a suitable space (independent of $h$) in
which to detect relationships between frequencies such as $\alpha_{h,j}, \beta_{h,j}$. In some sense,
this universal nilmanifold will play the role that the unit circle $\mathbb{T}$ plays in Fourier
analysis.

We first define the notion of universal nilmanifold that we need.

Definition 9.1 (Universal nilmanifold). A dimension vector is a tuple

$$D = (D_1, \ldots, D_{s-1}) \in \mathbb{N}^{s-1}$$

of standard natural numbers. Given a dimension vector, we define the universal
nilpotent group $G^D = G^{D, \le (s-1, r_*)}$ of degree-rank $(s-1, r_*)$ to be the Lie group
generated by formal generators $e_{i,j}$ for $1 \le i \le s-1$ and $1 \le j \le D_i$, subject to
the following constraints:

• Any $(m-1)$-fold iterated commutator of $e_{i_1,j_1}, \ldots, e_{i_m,j_m}$ with $i_1 + \ldots +
i_m \ge s$ is trivial.

• Any $(m-1)$-fold iterated commutator of $e_{i_1,j_1}, \ldots, e_{i_m,j_m}$ with $i_1 + \ldots +
i_m = s-1$ and $m \ge r_* + 1$ is trivial.

We give this group a degree-rank filtration $(G^D_{(d,r)})_{(d,r) \in \mathrm{DR}}$ by defining $G^D_{(d,r)}$ to be
the Lie group generated by $(m-1)$-fold iterated commutators of $e_{i_1,j_1}, \ldots, e_{i_m,j_m}$
with $1 \le i_l \le s-1$ and $1 \le j_l \le D_{i_l}$ for all $1 \le l \le m$ for which either $i_1 + \ldots + i_m > d$,
or $i_1 + \ldots + i_m = d$ and $m \ge r$. It is not hard to verify that this is indeed a filtration
of degree-rank $\le (s-1, r_*)$. We then let $\Gamma^D$ be the discrete group generated by
the $e_{i,j}$ with $1 \le i \le s-1$ and $1 \le j \le D_i$, and refer to $G^D/\Gamma^D$ as the universal
nilmanifold with dimension vector $D$.

A universal vertical frequency at dimension vector $D$ is a continuous homomorphism
$\eta : G^D_{(s-1,r_*)} \to \mathbb{R}$ which sends $\Gamma^D_{(s-1,r_*)}$ to the integers (i.e. a filtered
homomorphism from $G^D_{(s-1,r_*)}/\Gamma^D_{(s-1,r_*)}$ to $\mathbb{T}$).

Remark. One can give an explicit basis for this nilmanifold in terms of certain
iterated commutators of the $e_{i,j}$, following [46, 49]. This can then be used to
relate nilcharacters to bracket polynomials, as in [3^, and it is then possible to 
develop enough of a "bracket calculus" to substitute for some of the nilpotent 
algebra performed in this paper. However, we will not proceed by such a route here 
(as it would make the paper even longer than it currently is), and in fact will not
need an explicit basis for universal nilmanifolds at all. 

Example 9.2. The unit circle with the degree $\le d$ filtration (see Example 4.3) is
isomorphic to the universal nilmanifold $G^{(0,\ldots,0,1), \le (d,1)}/\Gamma^{(0,\ldots,0,1), \le (d,1)}$; thus for instance the unit
circle with the lower central series filtration is isomorphic to $G^{(1), \le (1,1)}/\Gamma^{(1), \le (1,1)}$. A universal
vertical frequency for any of these nilmanifolds is essentially just a map of the form
$\eta : x \mapsto nx$ for some integer $n$.

Example 9.3. The Heisenberg group (6.1) (with the lower central series filtration)
is the universal nilpotent group $G^{(2,0)} = G^{(2,0), \le (2,2)}$ of degree-rank $(2,2)$
(after identifying $e_1, e_2$ with $e_{1,1}$ and $e_{1,2}$ respectively), and the Heisenberg
nilmanifold $G/\Gamma$ is the corresponding universal nilmanifold $G^{(2,0)}/\Gamma^{(2,0)}$. If we reduce
the degree-rank from $(2,2)$ to $(2,1)$, then the commutator $[e_1, e_2]$ now trivialises,
and $G^{(2,0), \le (2,1)}$ collapses to the abelian Lie group $\mathbb{R}^2 = G^{(2), \le (1,1)}$, with universal
nilmanifold $\mathbb{T}^2$.

If, instead of the lower central series filtration, one gives the Heisenberg group
(6.1) the filtration used in Example 6.21 to model the sequence (6.9), then this
group is isomorphic to the universal nilpotent group $G^{(1,1,0), \le (3,2)}$, with the two
generators $e_1, e_2$ of the Heisenberg group now being interpreted as $e_{1,1}$ and $e_{2,1}$
respectively.

Example 9.4. Consider the universal nilpotent group $G^{(D_1,D_2,D_3), \le (3,3)}$. This
group is generated by "degree 1" generators $e_{1,1}, \ldots, e_{1,D_1}$, "degree 2" generators
$e_{2,1}, \ldots, e_{2,D_2}$, and "degree 3" generators $e_{3,1}, \ldots, e_{3,D_3}$, with any iterated commutator
of total degree exceeding three vanishing (thus for instance the degree 3
generators are central, and the degree 2 generators commute with each other). If
one drops the degree-rank from $(3,3)$ to $(3,2)$, then all triple commutators of degree 1
generators, such as $[[e_{1,i}, e_{1,j}], e_{1,k}]$, now vanish, reducing the dimension of the
nilpotent group. Dropping the degree-rank further to $(3,1)$ also eliminates the commutators
of degree 1 and degree 2 generators (thus making the degree 2 generators
central). Finally, dropping the degree-rank to $(3,0)$ eliminates the degree 3 generators
completely, and indeed $G^{(D_1,D_2,D_3), \le (3,0)}$ is isomorphic to $G^{(D_1,D_2), \le (2,2)}$.

Example 9.5. The free $s$-step nilpotent group on $D$ generators, in our notation, becomes
$G^{(D,0,\ldots,0), \le (s,s)}$. We may thus view the universal nilpotent groups $G^{D, \le (d,r)}$
as generalisations of the free nilpotent groups, in which some of the generators are
allowed to be weighted to have degrees greater than 1, and there is an additional
rank parameter to cut down some of the top-order behaviour.

It will be an easy matter to lift a nilcharacter $\chi$ from a general degree-rank
$\le (s-1, r_*)$ nilmanifold $G/\Gamma$ to a universal nilmanifold $G^D/\Gamma^D$ for some sufficiently
large dimension vector $D$ (see Lemma 9.12 below). Once one does so, we will need
to extract the various "top order frequencies" present in that nilcharacter. For
instance, if $s = 4$ and $\chi$ is (some vector-valued smoothing of) the degree 3 phase

$$n \mapsto e(\{\alpha n\}\beta n^2 + \gamma n^3 + \delta n^2 + \{\varepsilon n\}\mu n + \nu n + \theta)$$

then we will need to extract out the "degree 3" frequency $\gamma$, the "degree 2" frequency
$\beta$, and the "degree 1" frequency $\alpha$. (The remaining parameters $\delta, \varepsilon, \mu, \nu, \theta$ only
contribute to terms of degree strictly less than 3, and will not need to be extracted.)
As it turns out, the degree $i$ frequencies will most naturally live in the $i$-th horizontal
torus of the relevant universal nilmanifold; we now pause to define these
tori precisely. (These tori also implicitly appeared in [27, Appendix A].)

Definition 9.6 (Horizontal Taylor coefficients). Let $G = (G, (G_{(d,r)})_{(d,r) \in \mathrm{DR}})$ be
a degree-rank-filtered nilpotent group. For every $i \ge 0$, define the $i$-th horizontal
space $\mathrm{Horiz}_i(G)$ to be the abelian group

$$\mathrm{Horiz}_i(G) := G_{(i,1)}/G_{(i,2)},$$

with the convention that $G_{(d,r)} := G_{(d+1,0)}$ if $r > d$ (so in particular, $G_{(1,2)} =
G_{(2,0)}$).

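For any polynomial map $g \in \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$, we define the $i$-th horizontal Taylor
coefficient $\mathrm{Taylor}_i(g) \in \mathrm{Horiz}_i(G)$ to be the quantity

$$\mathrm{Taylor}_i(g) := \partial_1 \cdots \partial_1 g(n) \bmod G_{(i,2)}$$

(with $i$ derivatives) for any $n \in \mathbb{Z}$. Note that this map is well-defined since $\partial_1 \cdots \partial_1 g$
takes values in $G_{(i,1)}$ and has first derivatives in $G_{(i+1,1)}$ and hence in $G_{(i,2)}$.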
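If $\Gamma$ is a subgroup of $G$, we define

$$\mathrm{Horiz}_i(G/\Gamma) := \mathrm{Horiz}_i(G)/\mathrm{Horiz}_i(\Gamma)$$

and for a polynomial orbit $\mathcal{O} \in \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}}) := \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})/\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to
\Gamma_{\mathbb{N}})$, we define the $i$-th horizontal Taylor coefficient $\mathrm{Taylor}_i(\mathcal{O}) \in \mathrm{Horiz}_i(G/\Gamma)$ to be
the quantity defined by

$$\mathrm{Taylor}_i(g\Gamma) := \mathrm{Taylor}_i(g) \bmod \mathrm{Horiz}_i(\Gamma)$$

for any $g \in \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$; it is easy to see that this quantity is well-defined.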

These concepts extend to the ultralimit setting in the obvious manner; thus
for instance, if $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$, then $\mathrm{Taylor}_i(\mathcal{O})$ is an element of
${}^*\mathrm{Horiz}_i(G/\Gamma)$.

If $G/\Gamma$ is a degree-rank filtered nilmanifold, it is easy to see that the horizontal
spaces $\mathrm{Horiz}_i(G)$ are abelian Lie groups, and that $\mathrm{Horiz}_i(\Gamma)$ is a sublattice of
$\mathrm{Horiz}_i(G)$, so $\mathrm{Horiz}_i(G/\Gamma)$ is a torus, which we call the $i$-th horizontal torus of $G/\Gamma$.

Remark. The above definition can be generalised by replacing the domain $\mathbb{Z}$
with an arbitrary additive group $H = (H, +)$. In that case, the Taylor coefficient
$\mathrm{Taylor}_i(g)$ is not a single element of $\mathrm{Horiz}_i(G)$, but is instead a map $\mathrm{Taylor}_i(g) :
H^i \to \mathrm{Horiz}_i(G)$ defined by the formula

$$\mathrm{Taylor}_i(g)(h_1, \ldots, h_i) := \partial_{h_1} \cdots \partial_{h_i} g(n) \bmod G_{(i,2)}$$

for $h_1, \ldots, h_i \in H$. Using Corollary B.7 we easily see that this map is symmetric
and multilinear; thus for instance when $H = \mathbb{Z}$ we have

$$\mathrm{Taylor}_i(g)(h_1, \ldots, h_i) = h_1 \cdots h_i\, \mathrm{Taylor}_i(g).$$

However, we will not need this generalisation here.

A further application of Corollary B.7 shows that the map $g \mapsto \mathrm{Taylor}_i(g)$ is a
homomorphism. As a corollary, we see that any translate $g(\cdot + h) = (\partial_h g)g$ of $g$
will have the same Taylor coefficients as $g$: $\mathrm{Taylor}_i(g(\cdot + h)) = \mathrm{Taylor}_i(g)$.
Example 9.7. Consider the unit circle $G/\Gamma = \mathbb{T}$ with the degree $\le d$ filtration
(see Example 4.3). Then the $d$-th horizontal torus is $\mathbb{T}$, and all other horizontal tori
are trivial. If $\alpha_0, \ldots, \alpha_d \in {}^*\mathbb{R}$, then the map $\mathcal{O} : n \mapsto \alpha_0 + \cdots + \alpha_d n^d \bmod 1$ is
a polynomial orbit in ${}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to \mathbb{T}_{\mathbb{N}})$, and the $d$-th horizontal Taylor coefficient is
the quantity $d!\,\alpha_d \bmod 1$ in ${}^*\mathbb{T}$. (All other horizontal Taylor coefficients
are of course trivial.) Thus we see that the horizontal coefficient captures most of
the top order coefficient $\alpha_d$, but totally ignores all lower order terms.
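
For instance, when $d = 2$ the computation behind this example reads as follows. Additive notation on $\mathbb{T}$ is used, and the convention $\partial_1 \mathcal{O}(n) := \mathcal{O}(n+1) - \mathcal{O}(n)$ is assumed here for the purpose of the illustration.

```latex
\begin{align*}
  \mathcal{O}(n) &= \alpha_0 + \alpha_1 n + \alpha_2 n^2 \pmod 1, \\
  \partial_1 \mathcal{O}(n) &= \alpha_1 + \alpha_2 (2n + 1) \pmod 1, \\
  \partial_1 \partial_1 \mathcal{O}(n) &= 2\alpha_2 = 2!\,\alpha_2 \pmod 1 .
\end{align*}
% The lower order coefficients \alpha_0 and \alpha_1 have disappeared after two
% differences, and the result no longer depends on n.
```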

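Example 9.8. Let $G = G^{(2,1)} = G^{(2,1), \le (2,2)}$ be the universal nilpotent group of
degree-rank $(2, 2)$. Thus $G$ is generated by $e_{1,1}, e_{1,2}, e_{2,1}$, with relations

$$[[e_{1,1}, e_{1,2}], e_{1,i}] = [e_{1,i}, e_{2,1}] = 1 \quad \text{for } i = 1, 2,$$

and with the degree-rank filtration

$$G_{(0,0)} = G_{(1,0)} = G_{(1,1)} = G,$$

$$G_{(2,0)} = G_{(2,1)} = \langle [e_{1,1}, e_{1,2}],\, e_{2,1} \rangle_{\mathbb{R}},$$

$$G_{(2,2)} = \langle [e_{1,1}, e_{1,2}] \rangle_{\mathbb{R}},$$

and the lattice

$$\Gamma = \Gamma^{(2,1)} = \Gamma^{(2,1), \le (2,2)} := \langle e_{1,1}, e_{1,2}, e_{2,1} \rangle.$$

Let $\alpha, \beta, \gamma \in {}^*\mathbb{R}$, and consider the orbit $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$ defined by the
formula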
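$$\mathcal{O}(n) := g(n)\,{}^*\Gamma, \qquad g(n) := e_{1,1}^{\alpha n}\, e_{1,2}^{\beta n}\, e_{2,1}^{\gamma n^2};$$

this is polynomial by Example 4.4. Then

$$\mathrm{Taylor}_1(g) = \partial_1 g(n) \bmod {}^*G_{(2,0)} = e_{1,1}^{\alpha}\, e_{1,2}^{\beta} \bmod {}^*G_{(2,0)},$$

and

$$\mathrm{Taylor}_2(g) = e_{2,1}^{2\gamma} \bmod {}^*G_{(2,2)}.$$

Passing to the orbit $\mathcal{O} = g\,{}^*\Gamma$, we then have

$$\mathrm{Taylor}_1(g\,{}^*\Gamma) = e_{1,1}^{\alpha}\, e_{1,2}^{\beta} \bmod {}^*G_{(2,0)}\,{}^*\Gamma$$

and

$$\mathrm{Taylor}_2(g\,{}^*\Gamma) = e_{2,1}^{2\gamma} \bmod {}^*G_{(2,2)}\,{}^*\Gamma_{(2,1)}.$$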

Example 9.9. Let $G/\Gamma$ be the Heisenberg nilmanifold (6.1) with the lower central
series filtration. Thus $G/\Gamma$ is a degree $\le 2$ nilmanifold, which can then be
viewed as a degree-rank $\le (2, 2)$ nilmanifold by Example 6.11. The first horizontal
torus $\mathrm{Horiz}_1(G/\Gamma)$ is isomorphic to the 2-torus $\mathbb{T}^2$, with generators given by
$e_1, e_2 \bmod G_2\Gamma$. The second horizontal torus $\mathrm{Horiz}_2(G/\Gamma)$ is trivial, since $G_{(2,1)} =
[G, G]$ is equal to $G_{(2,2)} = G_2$. If $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$ is the polynomial orbit
$\mathcal{O} : n \mapsto e_2^{\beta n}\, e_1^{\alpha n}\,{}^*\Gamma$, then the first Taylor coefficient is the quantity $(\alpha, \beta)$. Note also
that if one modified the polynomial orbit by a further factor of $[e_1, e_2]^{\gamma n^2 + \delta n + \varepsilon}$,
this would not impact the Taylor coefficients at all. Thus we see that the Taylor
coefficients only capture the frequencies associated to raw generators such as $e_1$ and
$e_2$, and not to commutators such as $[e_1, e_2]$.

Example 9.10. Now consider the Heisenberg group (|6.ip with the filtration used 
in Example [62T] to model the sequence (|6.9p . This is now a degree ^ 3 nilmanifold, 
whose first horizontal torus $\mathrm{Horiz}_1(G/\Gamma)$ is isomorphic to the one-torus $\mathbb{T}$ with generator $e_2 \bmod G_{(2,0)}\Gamma$, whose second horizontal torus $\mathrm{Horiz}_2(G/\Gamma)$ is isomorphic to the one-torus $\mathbb{T}$ with generator $e_1 \bmod G_{(2,2)}\Gamma_{(2,1)}$, and whose third horizontal torus $\mathrm{Horiz}_3(G/\Gamma)$ is trivial. If $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$ is the polynomial orbit $\mathcal{O} : n \mapsto e_2^{\beta n} e_1^{\alpha n^2}\,{}^*\Gamma$, then the first Taylor coefficient is the linear limit map $n \mapsto \beta n \bmod 1$, and the second Taylor coefficient is the quantity $2\alpha \bmod 1$.

We now have enough notation to be able to formally assign frequencies to a 
nilcharacter, by means of a package of data which we shall call a representation. 

Definition 9.11 (Representation). Let $\chi \in L^\infty[N]$ be a nilcharacter of degree-rank $\leq (s-1, r_*)$. A representation of $\chi$ is a collection of the following data:

(i) A filtered nilmanifold $G/\Gamma$ of degree-rank $\leq (s-1, r_*)$;

(ii) A filtered nilmanifold $G_0/\Gamma_0$ of degree-rank $\leq (s-1, r_*-1)$;

(iii) A function $F \in \mathrm{Lip}({}^*(G/\Gamma \times G_0/\Gamma_0) \to S^\omega)$;

(iv) Polynomial orbits $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$ and $\mathcal{O}_0 \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G_0/\Gamma_0)_{\mathbb{N}})$;

(v) A dimension vector $D = (D_1, \ldots, D_{s-1}) \in \mathbb{N}^{s-1}$;

(vi) A universal vertical frequency $\eta : G^{D}_{(s-1,r_*)} \to \mathbb{R}$ at dimension $D$ on the universal nilmanifold $G^D/\Gamma^D$ of degree-rank $(s-1, r_*)$;

(vii) A filtered homomorphism $\phi : G^D/\Gamma^D \to G/\Gamma$ (see Definition 6.15);

(viii) A horizontal frequency vector $\mathcal{F} = (\xi_{i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D_i}$ of frequencies $\xi_{i,j} \in {}^*\mathbb{T}$,

which obeys the following properties:

(i) For all $n \in [N]$, one has

\[ \chi(n) = F(\mathcal{O}(n), \mathcal{O}_0(n)). \tag{9.4} \]

(ii) For every $t \in G^D_{(s-1,r_*)}$, all $x \in G/\Gamma$, and $x_0 \in G_0/\Gamma_0$, one has

\[ F(\phi(t)x, x_0) = e(\eta(t)) F(x, x_0). \tag{9.5} \]

(iii) For every $1 \leq i \leq s-1$, one has

\[ \mathrm{Taylor}_i(\mathcal{O}) = \pi_{\mathrm{Horiz}_i(G/\Gamma)}\Bigl( \phi\Bigl( \prod_{j=1}^{D_i} e_{i,j}^{\xi_{i,j}} \Bigr) \Bigr), \tag{9.6} \]

where $\pi_{\mathrm{Horiz}_i(G/\Gamma)} : G_i \to \mathrm{Horiz}_i(G/\Gamma)$ is the projection map; observe that the right-hand side is well-defined even though $\xi_{i,j}$ is only defined modulo 1.

We call the triplet $(D, \mathcal{F}, \eta)$ a total frequency representation of the nilcharacter $\chi$.

This is a rather complicated definition, and we now illustrate it with a number of examples. We begin with the $s = 2$, $r_* = 1$ case, taking $\chi$ to be the degree-rank $(1,1)$ nilcharacter

\[ \chi(n) := e(\xi n + \theta) \]

for some $\xi, \theta \in {}^*\mathbb{R}$. Let $D_1 \geq 1$ be an integer, let $\mathcal{F} = (\xi_{1,1}, \ldots, \xi_{1,D_1}) \in {}^*\mathbb{T}^{D_1}$ be a collection of frequencies, and let $\eta : \mathbb{R}^{D_1} \to \mathbb{R}$ be the universal vertical frequency $\eta(x_1, \ldots, x_{D_1}) := a_1 x_1 + \cdots + a_{D_1} x_{D_1}$ for some integers $a_1, \ldots, a_{D_1} \in \mathbb{Z}$. Then $((D_1), \mathcal{F}, \eta)$ will be a total frequency representation of $\chi$ if $\xi = a_1 \xi_{1,1} + \cdots + a_{D_1} \xi_{1,D_1}$. Indeed, in that case, one can take $G/\Gamma = \mathbb{T}$ (with the degree-rank $\leq (1,1)$ filtration, see Example 6.12), $G_0/\Gamma_0$ to be trivial, $F$ equal to the exponential function $(x, ()) \mapsto e(x)$, $\phi : \mathbb{T}^{D_1} \to \mathbb{T}$ to be the filtered homomorphism

\[ \phi(x_1, \ldots, x_{D_1}) := a_1 x_1 + \cdots + a_{D_1} x_{D_1}, \]

and $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to \mathbb{T}_{\mathbb{N}})$ to be the orbit $n \mapsto \xi n + \theta \bmod 1$. This should be compared with (9.3) and the discussion at the start of the section.

For a slightly more complicated example, we take $s = 3$, $r_* = 1$, and let $\chi$ be the degree-rank $(2,1)$ nilcharacter

\[ \chi(n) := e(\alpha n^2 + \beta n + \gamma). \]

We let $D_2 \geq 1$ be an integer, set $D_1 := 0$, let $\mathcal{F} = ((), (\xi_{2,1}, \ldots, \xi_{2,D_2})) \in {}^*\mathbb{T}^0 \times {}^*\mathbb{T}^{D_2}$ be a collection of frequencies, and let $\eta : \mathbb{R}^{D_2} \to \mathbb{R}$ be the universal vertical frequency $\eta(x_1, \ldots, x_{D_2}) := a_1 x_1 + \cdots + a_{D_2} x_{D_2}$ for some integers $a_1, \ldots, a_{D_2} \in \mathbb{Z}$. Then $((0, D_2), \mathcal{F}, \eta)$ will be a total frequency representation of $\chi$ if $2\alpha = a_1 \xi_{2,1} + \cdots + a_{D_2} \xi_{2,D_2} \bmod 1$ (cf. (9.3)); the factor $2$ is the $d!$ of Example 9.7 with $d = 2$. Indeed, we can take $G/\Gamma = \mathbb{T}$ with the degree-rank $\leq (2,1)$ filtration (see Example 6.12), $G_0/\Gamma_0 = \mathbb{T}$ with the degree-rank $\leq (1,1)$ filtration, the orbits

\[ (\mathcal{O}(n), \mathcal{O}_0(n)) := (\alpha n^2 \bmod 1,\ \beta n + \gamma \bmod 1), \]

and $F : G/\Gamma \times G_0/\Gamma_0 \to S^1$ to be the function

\[ F(x, y) := e(x) e(y), \]

and $\phi : \mathbb{T}^{D_2} \to \mathbb{T}$ to be the filtered homomorphism

\[ \phi(x_1, \ldots, x_{D_2}) := a_1 x_1 + \cdots + a_{D_2} x_{D_2}. \]

Note how the lower order terms $\beta n + \gamma$ in the phase of $\chi$ are shunted off to the lower degree-rank nilmanifold $G_0/\Gamma_0$ and thus do not interact at all with the data $\mathcal{F}, \eta$. In this particular case, this shunting off was unnecessary, and one could have easily folded these lower order terms into the dynamics of the primary nilmanifold $G/\Gamma$; but in the next example we give, the lower order behaviour does genuinely need to be separated from the top order behaviour by placing it in a separate nilmanifold.

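We now turn to a genuinely non-abelian example of a universal representation. For this, we take $s = 3$, $r_* = 2$, and let $\chi$ be a degree-rank $(2,2)$ nilcharacter that is a suitable vector-valued smoothing of the bracket polynomial phase

\[ n \mapsto e(\{\alpha n\}\beta n + \gamma n^2). \]

We can express this nilcharacter as

\[ \chi(n) = F(\mathcal{O}(n), \mathcal{O}_0(n)), \]

where $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$ is the orbit

\[ \mathcal{O}(n) := e_2^{\beta n} e_1^{\alpha n}\, \Gamma \]

into the Heisenberg nilmanifold (6.1) (which we give the degree-rank $\leq (2,2)$ filtration), $\mathcal{O}_0 \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G_0/\Gamma_0)_{\mathbb{N}})$ is the orbit

\[ \mathcal{O}_0(n) := \gamma n^2 \bmod 1 \]

into the unit circle $G_0/\Gamma_0 = \mathbb{T}$ (which we give the degree-rank $\leq (2,1)$ filtration, see Example 6.12), and $F$ is a suitable vector-valued smoothing of the map

\[ (e_1^{t_1} e_2^{t_2} [e_1, e_2]^{t_{12}}\, \Gamma,\ y) \mapsto e(t_{12}) e(y) \]

for $t_1, t_2, t_{12} \in I_0$. By Example 9.9 we have $\mathrm{Taylor}_1(\mathcal{O}) = (\alpha \bmod 1, \beta \bmod 1)$ and $\mathrm{Taylor}_2(\mathcal{O})$ is trivial.

Now let $D_1 \geq 1$ be an integer, set $D_2 := 0$, let $\mathcal{F} = ((\xi_{1,1}, \ldots, \xi_{1,D_1}), ()) \in {}^*\mathbb{T}^{D_1} \times {}^*\mathbb{T}^0$ be a collection of frequencies. The subgroup $G^{(D_1,0)}_{(2,2)}$ of the universal nilmanifold $G^{(D_1,0)} = G^{(D_1,0), \leq (2,2)}$ is then the abelian Lie group generated by the commutators $[e_{1,i}, e_{1,j}]$ for $1 \leq i < j \leq D_1$. We let $a_1, \ldots, a_{D_1}, b_1, \ldots, b_{D_1} \in \mathbb{Z}$ be integers, and let $\phi : G^{(D_1,0)}/\Gamma^{(D_1,0)} \to G/\Gamma$ be the filtered homomorphism that maps $e_{1,i}$ to $e_1^{a_i} e_2^{b_i}$ for $i = 1, \ldots, D_1$, thus

\[
\begin{aligned}
\phi\Bigl( \prod_{i=1}^{D_1} e_{1,i}^{t_i} \prod_{1 \leq i < j \leq D_1} [e_{1,i}, e_{1,j}]^{t_{i,j}}\, \Gamma^{(D_1,0)} \Bigr)
&= \prod_{i=1}^{D_1} (e_1^{a_i} e_2^{b_i})^{t_i} \prod_{1 \leq i < j \leq D_1} [e_1^{a_i} e_2^{b_i}, e_1^{a_j} e_2^{b_j}]^{t_{i,j}}\, \Gamma \\
&= e_1^{\sum_{i=1}^{D_1} a_i t_i}\, e_2^{\sum_{i=1}^{D_1} b_i t_i}\, [e_1, e_2]^{-\sum_{i=1}^{D_1} a_i b_i \binom{t_i}{2} - \sum_{1 \leq i < j \leq D_1} b_i a_j t_i t_j + \sum_{1 \leq i < j \leq D_1} (a_i b_j - a_j b_i) t_{i,j}}\, \Gamma.
\end{aligned}
\]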
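The last identity is a routine Heisenberg group computation, and it can be checked mechanically for integer exponents. The following is a minimal sketch under our own conventions (none of the names come from the text): a triple `(x, y, z)` stands for the unipotent matrix with entries $x$, $y$ above the diagonal and $z$ in the corner, the commutator is $[g,h] = g^{-1}h^{-1}gh$, and the lattice $\Gamma$ is ignored so that the identity is tested in the group itself.

```python
from itertools import combinations


def mul(g, h):
    """Product of two upper unitriangular 3x3 matrices stored as (x, y, z)."""
    return (g[0] + h[0], g[1] + h[1], g[2] + h[2] + g[0] * h[1])


def inv(g):
    x, y, z = g
    return (-x, -y, x * y - z)


def power(g, t):
    """g**t for any integer t, by repeated multiplication."""
    base = g if t >= 0 else inv(g)
    out = (0, 0, 0)
    for _ in range(abs(t)):
        out = mul(out, base)
    return out


def commutator(g, h):
    return mul(mul(inv(g), inv(h)), mul(g, h))


E1, E2 = (1, 0, 0), (0, 1, 0)  # generators e_1, e_2; [e_1, e_2] is (0, 0, 1)


def lhs(a, b, t, t2):
    """prod_i (e1^a_i e2^b_i)^t_i * prod_{i<j} [e1^a_i e2^b_i, e1^a_j e2^b_j]^t_ij."""
    u = [mul(power(E1, ai), power(E2, bi)) for ai, bi in zip(a, b)]
    out = (0, 0, 0)
    for ui, ti in zip(u, t):
        out = mul(out, power(ui, ti))
    for i, j in combinations(range(len(u)), 2):
        out = mul(out, power(commutator(u[i], u[j]), t2[i, j]))
    return out


def rhs(a, b, t, t2):
    """e1^X e2^Y [e1, e2]^Z with the exponents from the displayed formula."""
    pairs = list(combinations(range(len(a)), 2))
    X = sum(ai * ti for ai, ti in zip(a, t))
    Y = sum(bi * ti for bi, ti in zip(b, t))
    Z = (-sum(ai * bi * ti * (ti - 1) // 2 for ai, bi, ti in zip(a, b, t))
         - sum(b[i] * a[j] * t[i] * t[j] for i, j in pairs)
         + sum((a[i] * b[j] - a[j] * b[i]) * t2[i, j] for i, j in pairs))
    return (X, Y, X * Y + Z)  # e1^X e2^Y [e1,e2]^Z has corner entry X*Y + Z
```

For instance, with the hypothetical data `a = (1, 2)`, `b = (3, 1)`, `t = (2, 1)` and `t2 = {(0, 1): 1}` both sides evaluate to the same triple.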

Let us now see what conditions are required for $((D_1, 0), \mathcal{F}, \eta)$ to be a total frequency representation of $\chi$. The condition (9.6) becomes the constraints

\[ \alpha = \sum_{i=1}^{D_1} a_i \xi_{1,i} \bmod 1, \qquad \beta = \sum_{i=1}^{D_1} b_i \xi_{1,i} \bmod 1, \]

while the condition (9.5) becomes

\[ \eta([e_{1,i}, e_{1,j}]) = a_i b_j - a_j b_i \tag{9.7} \]

for all $1 \leq i < j \leq D_1$, or equivalently

\[ \eta\Bigl( \prod_{1 \leq i < j \leq D_1} [e_{1,i}, e_{1,j}]^{t_{i,j}} \Bigr) = \sum_{1 \leq i < j \leq D_1} (a_i b_j - a_j b_i) t_{i,j}. \]

Conversely, with these constraints we obtain a total frequency representation of $\chi$ by $((D_1, 0), \mathcal{F}, \eta)$. This should be compared with the heuristic (9.3). (Note from (6.4) that the top order component $\{\alpha n\}\beta n$ of $\chi$ is morally anti-symmetric in $\alpha, \beta$ modulo lower order terms, which is consistent with the anti-symmetry observed in (9.7).) Note also that the term $\gamma n^2$, which has lesser degree-rank than the top order term $\{\alpha n\}\beta n$, plays no role, due to it being shunted off to the lower degree-rank nilmanifold $G_0/\Gamma_0$. If instead we placed this term as part of the principal nilmanifold, then this would create a non-trivial second Taylor coefficient $\mathrm{Taylor}_2(\mathcal{O})$ which would then require a non-zero value of $D_2$ in order to recover a total frequency representation. Thus we see that in order to neglect terms of lesser degree-rank (but equal degree) it is necessary to create the secondary nilmanifold $G_0/\Gamma_0$ as a sort of "junk nilmanifold" to hold all such terms.

We make the easy remark that every nilcharacter $\chi$ of degree-rank $\leq (s-1, r_*)$ has at least one representation.

Lemma 9.12 (Existence of representation). Let $\chi$ be a nilcharacter of degree-rank $(s-1, r_*)$ on $[N]$. Then there exists at least one total frequency representation $(D, \mathcal{F}, \eta)$ of $\chi$.

Proof. By definition, $\chi = F \circ \mathcal{O}$ for some degree-rank $\leq (s-1, r_*)$ nilmanifold $G/\Gamma$, some $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$, and some $F \in \mathrm{Lip}({}^*(G/\Gamma))$ with a vertical frequency. For each $1 \leq i \leq s-1$, let $f_{i,1}, \ldots, f_{i,D_i}$ be a basis of generators for $\Gamma_i$, and let $D := (D_1, \ldots, D_{s-1})$ be the associated dimension vector. Then we have a filtered homomorphism $\phi : G^D \to G$ which maps $e_{i,j}$ to $f_{i,j}$ for all $1 \leq i \leq s-1$ and $1 \leq j \leq D_i$. It is easy to see that $\phi$ is surjective from $G^D_i$ to $G_i$ for each $i$, and so the map $\pi_{\mathrm{Horiz}_i(G/\Gamma)} \circ \phi$ is surjective from $G^D_i$ to $\mathrm{Horiz}_i(G/\Gamma)$. It is now an easy matter to locate frequencies $\xi_{i,j}$ obeying (9.6), and the vertical frequency property of $F$ can be pulled back via $\phi$ to give (9.5). Setting $G_0/\Gamma_0$ to be trivial, we obtain the claim. □

To conclude this section, we now give some basic facts about total frequency 
representations. These facts will not actually be used in this paper, but may serve 
to consolidate one's intuition about the nature of these representations. We first 
observe some linearity in the vertical frequency $\eta$.

Lemma 9.13 (Linearity). Suppose that $\chi, \chi'$ are two nilcharacters of degree-rank $(s-1, r_*)$ on $[N]$ that have total frequency representations $(D, \mathcal{F}, \eta)$ and $(D, \mathcal{F}, \eta')$ respectively. Then $\overline{\chi}$ has a total frequency representation $(D, \mathcal{F}, -\eta)$, and $\chi \otimes \chi'$ has a total frequency representation $(D, \mathcal{F}, \eta + \eta')$.

Proof. This is a routine matter of chasing down the definitions, and noting that 
nilmanifolds, polynomial orbits, etc. behave well with respect to direct sums. □ 

Lemma 9.14 (Triviality). Suppose that $\chi$ is a nilcharacter of degree-rank $(s-1, r_*)$ on $[N]$ that has a total frequency representation $(D, \mathcal{F}, 0)$. Then $\chi$ is a nilsequence of degree-rank $\leq (s-1, r_*-1)$ (i.e. $[\chi]_{\mathrm{Symb}^{(s-1,r_*)}_{\mathrm{DR}}([N])} = 0$).

Proof. By construction, we have

\[ \chi(n) = F(\mathcal{O}(n), \mathcal{O}_0(n)) \]

for some limit polynomial orbits $\mathcal{O} \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$, $\mathcal{O}_0 \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G_0/\Gamma_0)_{\mathbb{N}})$ into filtered nilmanifolds $G/\Gamma$, $G_0/\Gamma_0$ of degree-rank $\leq (s-1, r_*)$ and $\leq (s-1, r_*-1)$ respectively, where $F \in \mathrm{Lip}({}^*(G/\Gamma \times G_0/\Gamma_0) \to S^\omega)$. Furthermore, there exists a filtered homomorphism $\phi : G^D/\Gamma^D \to G/\Gamma$ such that (9.6) holds, and such that

\[ F(\phi(t)x, x_0) = F(x, x_0) \tag{9.8} \]

for all $t \in G^D_{(s-1,r_*)}$.

Let $T$ be the closure of the set $\{\phi(t) \bmod \Gamma_{(s-1,r_*)} : t \in G^D_{(s-1,r_*)}\}$; this is a subtorus of the torus $G_{(s-1,r_*)}/\Gamma_{(s-1,r_*)}$, and thus acts on $G/\Gamma$. As $F$ is continuous and obeys the invariance (9.8), we see that $F$ is $T$-invariant; we may thus quotient out by $T$ and assume that $T$ is trivial. In particular, $\phi$ now annihilates $G^D_{(s-1,r_*)}$.

We give $G$ a new degree-rank filtration $(G'_{(d,r)})_{(d,r) \in \mathrm{DR}}$ (smaller than the existing filtration $(G_{(d,r)})_{(d,r) \in \mathrm{DR}}$), by defining $G'_{(d,r)}$ to be the connected subgroup of $G$ generated by $G_{(d,r+1)}$ (recalling the convention $G_{(d,r)} := G_{(d+1,0)}$ when $r > d$) together with the image $\phi(G^D_{(d,r)})$ of $G^D_{(d,r)}$. It is easy to see that this is still a filtration, and that $G/\Gamma$ remains a filtered nilmanifold with this filtration, but now the degree-rank is $\leq (s-1, r_*-1)$ rather than $\leq (s-1, r_*)$. Furthermore, from (9.6) we see that $\mathcal{O}$ is still a polynomial orbit with respect to this new filtration. As such, $\chi$ is a nilsequence of degree-rank $\leq (s-1, r_*-1)$ as required. □

Combining the above two lemmas we obtain the following corollary. 

Corollary 9.15 (Representation determines symbol). Suppose that $\chi, \chi'$ are two nilcharacters of degree-rank $(s-1, r_*)$ on $[N]$ that have a common total frequency representation $(D, \mathcal{F}, \eta)$. Then $\chi, \chi'$ are equivalent. In other words, the symbol $[\chi]_{\mathrm{Symb}^{(s-1,r_*)}_{\mathrm{DR}}([N])}$ depends only on $(D, \mathcal{F}, \eta)$.

Note that the above results are consistent with the heuristic (9.3).

10. Linear independence and the sunflower lemma 

A basic fact of linear algebra is that every finitely generated vector space is finite-dimensional. In particular, if $v_1, \ldots, v_l$ are a finite collection of vectors in a vector space $V$ over a field $k$, then there exists a finite linearly independent set of vectors $v'_1, \ldots, v'_{l'}$ in $V$ such that each of the vectors $v_1, \ldots, v_l$ is a linear combination (over $k$) of the $v'_1, \ldots, v'_{l'}$. Indeed, one can take $v'_1, \ldots, v'_{l'}$ to be a set of vectors generating $v_1, \ldots, v_l$ for which $l'$ is minimal, since any linear relation amongst the $v'_1, \ldots, v'_{l'}$ can be used to decrease the "rank" $l'$, contradicting minimality (cf. the proof of the classical Steinitz exchange lemma in linear algebra).

We will need analogues of this type of fact for frequencies $\xi_1, \ldots, \xi_l$ in the limit unit circle ${}^*\mathbb{T}$. However, this space is not a vector space over a field, but is merely a module over a commutative ring $\mathbb{Z}$. As such, the direct analogue of the above statement fails; indeed, any standard rational in ${}^*\mathbb{T}$, such as $\tfrac{1}{2} \bmod 1$, clearly cannot be represented as a linear combination (over $\mathbb{Z}$) of a finite collection of frequencies in ${}^*\mathbb{T}$ that are linearly independent over $\mathbb{Z}$.

However, the standard rationals are the only obstruction to the above statement 
being true. More precisely, we have 

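Lemma 10.1 (Baby regularity lemma). Let $l \in \mathbb{N}$, and let $\xi_1, \ldots, \xi_l \in {}^*\mathbb{T}$. Then there exist $l', l'' \in \mathbb{N}$ and $\xi'_1, \ldots, \xi'_{l'}, \xi''_1, \ldots, \xi''_{l''} \in {}^*\mathbb{T}$ such that $\xi'_1, \ldots, \xi'_{l'}$ are linearly independent over $\mathbb{Z}$ (i.e. there exist no standard integers $a_1, \ldots, a_{l'}$, not all zero, such that $a_1 \xi'_1 + \cdots + a_{l'} \xi'_{l'} = 0$), each of the $\xi''_1, \ldots, \xi''_{l''}$ are rational (i.e. they live in $\mathbb{Q} \bmod 1$), and each of the $\xi_1, \ldots, \xi_l$ are linear combinations (over $\mathbb{Z}$) of the $\xi'_1, \ldots, \xi'_{l'}, \xi''_1, \ldots, \xi''_{l''}$.

Proof. Fix $l, \xi_1, \ldots, \xi_l$. Define a partial solution to be a collection of objects $l', l'', \xi'_1, \ldots, \xi'_{l'}, \xi''_1, \ldots, \xi''_{l''}$ satisfying all of the required properties, except possibly for the linear independence of the $\xi'_1, \ldots, \xi'_{l'}$. Clearly at least one partial solution exists, since one can take $l' := l$, $l'' := 0$, and $\xi'_i := \xi_i$ for all $1 \leq i \leq l$. Now let $l', l'', \xi'_1, \ldots, \xi'_{l'}, \xi''_1, \ldots, \xi''_{l''}$ be a partial solution for which $l'$ is minimal. We claim that $\xi'_1, \ldots, \xi'_{l'}$ is linearly independent over $\mathbb{Z}$, which will give the lemma. To see this, suppose for contradiction that there existed $a_1, \ldots, a_{l'} \in \mathbb{Z}$, not all zero, such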



^Indeed, one can recast this argument as a rank reduction argument instead of a minimal rank 
argument, for the same reason that the principle of infinite descent is logically equivalent to the 
well-ordering principle. In this infinitary (ultralimit) setting, there is very little distinction between 
the two approaches, although the minimality approach allows for slightly more compact notation 
and proofs. But in the finitary setting, it becomes significantly more difficult to implement the 
minimality approach, and the rank reduction approach becomes preferable. See [28] for finitary
"rank reduction" style arguments analogous to those given here.

that $a_1 \xi'_1 + \cdots + a_{l'} \xi'_{l'} = 0$. Without loss of generality we may assume that $a_1$ is non-zero. For each $2 \leq j \leq l'$, let $\tilde{\xi}_j \in {}^*\mathbb{T}$ be such that $a_1 \tilde{\xi}_j = \xi'_j$. We then have

\[ \xi'_1 = -\sum_{j=2}^{l'} a_j \tilde{\xi}_j + q \bmod 1 \]

for some standard rational $q \in \mathbb{Q}$. If we then replace $\xi'_1, \ldots, \xi'_{l'}$ by $\tilde{\xi}_2, \ldots, \tilde{\xi}_{l'}$ (decrementing $l'$ to $l' - 1$) and append $q$ to $\xi''_1, \ldots, \xi''_{l''}$, then we obtain a new partial solution with a smaller value of $l'$, contradicting minimality. The claim follows. □
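To make the reduction step concrete, here is a minimal sketch of a single step in a toy model. Everything in it is an assumption made for illustration: frequencies are restricted to the form $r + c\sqrt{2} \bmod 1$ with $r, c$ rational and stored as pairs `(r, c)`, and the name `reduce_once` is ours. Given a relation with $a_1 \neq 0$, the function returns the divided frequencies $\tilde{\xi}_j$ and the rational $q$ from the proof.

```python
from fractions import Fraction as Fr


def reduce_once(freqs, rel):
    """One rank reduction step for frequencies r + c*sqrt(2) mod 1, stored as (r, c).

    rel is an integer vector with rel[0] != 0 and sum(rel[j] * freqs[j]) = 0 mod 1.
    Returns (tilde, q) with freqs[0] = -sum(rel[j] * tilde[j-1] for j >= 1) + q mod 1
    and rel[0] * tilde[j-1] = freqs[j].
    """
    a1 = rel[0]
    tilde = [(r / a1, c / a1) for (r, c) in freqs[1:]]
    r1, c1 = freqs[0]
    q_rational = r1 + sum(a * t[0] for a, t in zip(rel[1:], tilde))
    q_irrational = c1 + sum(a * t[1] for a, t in zip(rel[1:], tilde))
    # The sqrt(2) part cancels, and a1 * q is an integer, so q is a standard rational.
    assert q_irrational == 0 and (a1 * q_rational).denominator == 1
    return tilde, q_rational % 1


# Hypothetical input: xi_1 = 1/2 + (3/2)sqrt(2), xi_2 = sqrt(2), with 2*xi_1 - 3*xi_2 = 1 = 0 mod 1.
freqs = [(Fr(1, 2), Fr(3, 2)), (Fr(0), Fr(1))]
tilde, q = reduce_once(freqs, [2, -3])
```

Here the two dependent frequencies are replaced by the single frequency $\tilde{\xi}_2 = \sqrt{2}/2$ together with the rational $q = 1/2$, since $\xi_1 = 3\tilde{\xi}_2 + \tfrac{1}{2}$ and $\xi_2 = 2\tilde{\xi}_2$.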

This lemma is too simplistic for our applications, and we will need to modify it 
in a number of ways. The first is to introduce an error term. 

Definition 10.2 (Linear independence). Let $\varepsilon > 0$ be a limit real, and let $l \in \mathbb{N}$. A set of frequencies $\xi_1, \ldots, \xi_l \in {}^*\mathbb{T}$ is said to be independent modulo $O(\varepsilon)$ if there do not exist any collection $a_1, \ldots, a_l \in \mathbb{Z}$ of standard integers, not all zero, for which

\[ a_1 \xi_1 + \cdots + a_l \xi_l = O(\varepsilon) \bmod 1. \]

(Thus, for instance, the empty set (with $l = 0$) is trivially independent modulo $O(\varepsilon)$.) Equivalently, $\xi_1, \ldots, \xi_l$ are linearly independent over $\mathbb{Z}$ after quotienting out by the subgroup $\varepsilon\mathbb{R} \bmod 1$.

This definition is only non-trivial when $\varepsilon$ is an infinitesimal (i.e. $\varepsilon = o(1)$). In practice, $\varepsilon$ will be a negative power of the unbounded integer $N$.
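Definition 10.2 quantifies over all standard integers and over limit objects, so it cannot be tested directly. A finitary proxy, given here only as a hedged sketch, is to search for a relation among coefficients of bounded size and with a fixed numerical tolerance. The names `circle_norm` and `find_relation`, the parameter `bound`, and the use of ordinary rationals are all assumptions of this illustration.

```python
from fractions import Fraction
from itertools import product


def circle_norm(x):
    """Distance from x to the nearest integer."""
    f = x % 1
    return min(f, 1 - f)


def find_relation(freqs, eps, bound):
    """Return a non-zero integer vector a with |a_i| <= bound and
    |a_1*xi_1 + ... + a_l*xi_l| <= eps in R/Z, or None if there is none."""
    for coeffs in product(range(-bound, bound + 1), repeat=len(freqs)):
        if any(coeffs) and circle_norm(sum(a * x for a, x in zip(coeffs, freqs))) <= eps:
            return coeffs
    return None


# Hypothetical frequencies 1/3 and 1/7.
freqs = [Fraction(1, 3), Fraction(1, 7)]
no_small_relation = find_relation(freqs, 0, 2)   # no exact relation with |a_i| <= 2
exact_relation = find_relation(freqs, 0, 3)      # an exact relation appears once 3 is allowed
approx_relation = find_relation(freqs, Fraction(1, 20), 2)
```

The example also shows why standard rationals must be split off in the lemmas of this section: rational frequencies always satisfy an exact relation once the coefficient bound is large enough.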
We have the following variant of Lemma 10.1.

Lemma 10.3 (Regularising one collection of frequencies). Let $l \in \mathbb{N}$, let $\xi_1, \ldots, \xi_l \in {}^*\mathbb{T}$, and let $\varepsilon > 0$ be a limit real. Then there exist $l', l'', l''' \in \mathbb{N}$ and

\[ \xi'_1, \ldots, \xi'_{l'},\ \xi''_1, \ldots, \xi''_{l''},\ \xi'''_1, \ldots, \xi'''_{l'''} \in {}^*\mathbb{T} \]

such that $\xi'_1, \ldots, \xi'_{l'}$ are linearly independent modulo $O(\varepsilon)$, each of the $\xi''_1, \ldots, \xi''_{l''}$ are rational, each of the $\xi'''_1, \ldots, \xi'''_{l'''}$ are $O(\varepsilon)$, and each of the $\xi_1, \ldots, \xi_l$ are linear combinations (over $\mathbb{Z}$) of the $\xi'_1, \ldots, \xi'_{l'}, \xi''_1, \ldots, \xi''_{l''}, \xi'''_1, \ldots, \xi'''_{l'''}$.

One can view Lemma 10.1 as the degenerate case $\varepsilon = 0$ of the above lemma.

Proof. We repeat the proof of Lemma 10.1. Define a partial solution to be a collection of objects $l', l'', l''', \xi'_1, \ldots, \xi'_{l'}, \xi''_1, \ldots, \xi''_{l''}, \xi'''_1, \ldots, \xi'''_{l'''}$ obeying all the required properties except possibly for the linear independence property. Again it is clear that at least one partial solution exists, so we may find a partial solution for which $l'$ is minimal. We claim that this is a complete solution. For if this is not the case, we have

\[ a_1 \xi'_1 + \cdots + a_{l'} \xi'_{l'} = O(\varepsilon) \bmod 1 \]

for some $a_1, \ldots, a_{l'} \in \mathbb{Z}$, not all zero. Again, we may assume that $a_1 \neq 0$. We again select $\tilde{\xi}_2, \ldots, \tilde{\xi}_{l'} \in {}^*\mathbb{T}$ with $a_1 \tilde{\xi}_j = \xi'_j$ for all $2 \leq j \leq l'$, and observe that

\[ \xi'_1 = -\sum_{j=2}^{l'} a_j \tilde{\xi}_j + q + s \bmod 1 \]

for some standard rational $q \in \mathbb{Q}$ and some $s = O(\varepsilon)$. If we then replace $\xi'_1, \ldots, \xi'_{l'}$ by $\tilde{\xi}_2, \ldots, \tilde{\xi}_{l'}$, and append $q$ and $s$ to $\xi''_1, \ldots, \xi''_{l''}$ and $\xi'''_1, \ldots, \xi'''_{l'''}$ respectively, we contradict minimality, and the claim follows. □

This lemma is still far too simplistic for our needs, because we will not be needing to regularise just one collection $\xi_1, \ldots, \xi_l$ of frequencies, but a whole family $\xi_{h,1}, \ldots, \xi_{h,l}$ of frequencies, where $h$ ranges over a parameter set $H$. Such frequencies can exhibit a range of behaviour in $h$; at one extreme, they might be completely independent of $h$, while at the other extreme, the frequencies may vary substantially as $h$ does. It turns out that in some sense, the general case is a combination of these extreme cases.

In this direction we have the following stronger version of Lemma 10.3.

Lemma 10.4 (Regularising many collections of frequencies). Let $l \in \mathbb{N}$, let $\varepsilon > 0$ be a limit real, let $H$ be a limit finite set, and for each $h \in H$, let $\xi_{h,1}, \ldots, \xi_{h,l}$ be frequencies in ${}^*\mathbb{T}$ that depend in a limit fashion on $h$. Then there exists a dense subset $H'$ of $H$, standard natural numbers $l_*, l', l'', l''' \in \mathbb{N}$, "core" frequencies $\xi_{*,1}, \ldots, \xi_{*,l_*}, \xi''_{*,1}, \ldots, \xi''_{*,l''} \in {}^*\mathbb{T}$, and "petal" frequencies

\[ \xi'_{h,1}, \ldots, \xi'_{h,l'},\ \xi'''_{h,1}, \ldots, \xi'''_{h,l'''} \in {}^*\mathbb{T} \]

for each $h \in H'$ depending in a limit fashion on $h$, and obeying the following properties:

(i) (Independence) For almost all triples $(h_1, h_2, h_3) \in (H')^3$ (i.e. for all but $o(|H'|^3)$ such triples), the frequencies

\[ \xi_{*,1}, \ldots, \xi_{*,l_*},\ \xi'_{h_1,1}, \ldots, \xi'_{h_1,l'},\ \xi'_{h_2,1}, \ldots, \xi'_{h_2,l'},\ \xi'_{h_3,1}, \ldots, \xi'_{h_3,l'} \]

are linearly independent modulo $O(\varepsilon)$.

(ii) (Rationality) For each $1 \leq j \leq l''$, $\xi''_{*,j}$ is a standard rational.

(iii) (Smallness) For each $h \in H'$ and $1 \leq j \leq l'''$, $\xi'''_{h,j} = O(\varepsilon)$.

(iv) (Representation) For each $h \in H'$, the $\xi_{h,1}, \ldots, \xi_{h,l}$ are linear combinations over $\mathbb{Z}$ of the frequencies

\[ \xi_{*,1}, \ldots, \xi_{*,l_*},\ \xi'_{h,1}, \ldots, \xi'_{h,l'},\ \xi''_{*,1}, \ldots, \xi''_{*,l''},\ \xi'''_{h,1}, \ldots, \xi'''_{h,l'''}. \]

Note that Lemma 10.4 collapses to Lemma 10.3 if $H$ is a singleton set.

Proof. We again use the usual argument. Define a partial solution to be a collection of objects $H', l_*, l', l'', l''', \xi_{*,j}, \xi'_{h,j}, \xi''_{*,j}, \xi'''_{h,j}$ obeying all the required properties except possibly for the independence property. Again, at least one partial solution exists, since we may take $H' := H$, $l_* := l'' := l''' := 0$, $l' := l$, and $\xi'_{h,j} := \xi_{h,j}$ for all $h \in H$ and $1 \leq j \leq l$. We may thus select a partial solution for which $l'$ is minimal; and among all such partial solutions with $l'$ minimal, we choose a solution with $l_*$ minimal for fixed $l'$ (i.e. we minimise with respect to the lexicographical ordering on $l'$ and $l_*$). We claim that this doubly minimal solution obeys the independence property, which would give the claim.

Suppose the independence property fails. Carefully negating the quantifiers and using Lemma A.9, we conclude that there exist standard integers $a_{*,j}$ for $1 \leq j \leq l_*$ and $a'_{i,j}$ for $i = 1, 2, 3$ and $1 \leq j \leq l'$, not all zero, such that one has the relation

\[ a_{*,1} \xi_{*,1} + \cdots + a_{*,l_*} \xi_{*,l_*} + \sum_{i=1}^{3} \sum_{j=1}^{l'} a'_{i,j} \xi'_{h_i,j} = O(\varepsilon) \bmod 1 \]

for many triples $(h_1, h_2, h_3) \in (H')^3$.

Suppose first that all of the $a'_{i,j}$ vanish, so that we have a linear relation

\[ a_{*,1} \xi_{*,1} + \cdots + a_{*,l_*} \xi_{*,l_*} = O(\varepsilon) \bmod 1 \]

that only involves core frequencies. Then the situation is basically the same as that of Lemma 10.3; without loss of generality we may take $a_{*,1} \neq 0$, and if we then choose $\tilde{\xi}_{*,2}, \ldots, \tilde{\xi}_{*,l_*}$ so that $a_{*,1} \tilde{\xi}_{*,j} = \xi_{*,j}$, then we can rewrite

\[ \xi_{*,1} = -\sum_{j=2}^{l_*} a_{*,j} \tilde{\xi}_{*,j} + q + s \bmod 1 \]

for some $q \in \mathbb{Q}$ and $s = O(\varepsilon)$, and one can then replace the $\xi_{*,1}, \ldots, \xi_{*,l_*}$ with $\tilde{\xi}_{*,2}, \ldots, \tilde{\xi}_{*,l_*}$ (decrementing $l_*$ by 1) and append $q$ and $s$ to each of the collections $\xi''_{*,1}, \ldots, \xi''_{*,l''}$ and $\xi'''_{h,1}, \ldots, \xi'''_{h,l'''}$ respectively for each $h \in H'$, contradicting minimality.

Now suppose that not all of the $a'_{i,j}$ vanish; without loss of generality we may assume that $a'_{1,1}$ is non-zero. By the pigeonhole principle, we can find $h_2, h_3 \in H'$ such that

\[ a_{*,1} \xi_{*,1} + \cdots + a_{*,l_*} \xi_{*,l_*} + \sum_{i=1}^{3} \sum_{j=1}^{l'} a'_{i,j} \xi'_{h_i,j} = O(\varepsilon) \bmod 1 \]

for all $h_1$ in a dense subset $H''$ of $H'$. Now let $\tilde{\xi}_{*,j} \in {}^*\mathbb{T}$ for $1 \leq j \leq l_*$ and $\tilde{\xi}'_{h,j} \in {}^*\mathbb{T}$ for $h \in H'$ and $1 \leq j \leq l'$ be such that $a'_{1,1} \tilde{\xi}_{*,j} = \xi_{*,j}$ and $a'_{1,1} \tilde{\xi}'_{h,j} = \xi'_{h,j}$; then we have

\[ \xi'_{h_1,1} = -\sum_{j=2}^{l'} a'_{1,j} \tilde{\xi}'_{h_1,j} - \sum_{j=1}^{l_*} a_{*,j} \tilde{\xi}_{*,j} - \sum_{i=2}^{3} \sum_{j=1}^{l'} a'_{i,j} \tilde{\xi}'_{h_i,j} + q_{h_1} + s_{h_1} \bmod 1 \]

for some standard rational $q_{h_1}$ and some $s_{h_1} = O(\varepsilon)$. Furthermore one can easily ensure that $q_{h_1}, s_{h_1}$ depend in a limit fashion on $h_1$. By Lemma A.9 (and refining $H''$) we may assume that $q_{h_1} = q_*$ is independent of $h_1$. We may thus replace $H'$ by $H''$ and replace $\xi'_{h,1}, \ldots, \xi'_{h,l'}$ by $\tilde{\xi}'_{h,2}, \ldots, \tilde{\xi}'_{h,l'}$ (decrementing $l'$ by 1), while appending $q_*$ and $s_h$ to $\xi''_{*,1}, \ldots, \xi''_{*,l''}$ and $\xi'''_{h,1}, \ldots, \xi'''_{h,l'''}$ respectively, and replacing $\xi_{*,1}, \ldots, \xi_{*,l_*}$ by $\tilde{\xi}_{*,1}, \ldots, \tilde{\xi}_{*,l_*}, \tilde{\xi}'_{h_2,1}, \ldots, \tilde{\xi}'_{h_2,l'}, \tilde{\xi}'_{h_3,1}, \ldots, \tilde{\xi}'_{h_3,l'}$ (incrementing $l_*$ as necessary). This contradicts the minimality of the partial solution, and the claim follows. □

This is still too simplistic for our applications, as the independence hypothesis on triples $(h_1, h_2, h_3)$ will not quite be strong enough to give everything we need. Ideally (in view of Proposition 7.3), we would like to have independence of the $\xi_{*,1}, \ldots, \xi_{*,l_*}, \xi'_{h_1,1}, \ldots, \xi'_{h_4,l'}$ for almost all additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H'$. Unfortunately, this need not be the case; indeed, if the original $\xi_{h,i}$ are linear in $h$, say $\xi_{h,i} = \alpha_i h$ for some $\alpha_i \in {}^*\mathbb{T}$ and all $1 \leq i \leq l'$, then we have $\xi_{h_1,i} + \xi_{h_2,i} = \xi_{h_3,i} + \xi_{h_4,i}$ for all additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H'$ and all $1 \leq i \leq l'$, and as a consequence it is not possible to obtain a decomposition as in Lemma 10.4 with the stronger independence property mentioned above. A similar obstruction occurs if the $\xi_{h,i}$ are bracket-linear in $h$, for instance if $\xi_{h,i} = \{\alpha_i h\}\beta_i \bmod 1$ for some $\alpha_i \in {}^*\mathbb{T}$ and $\beta_i \in {}^*\mathbb{R}$.
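The linear obstruction is easy to see numerically. The sketch below is an illustration with assumed data (a rational $\alpha$, the set $H = \{1, \ldots, 20\}$, and the helper name `additive_quadruples` are all our own choices): it enumerates additive quadruples and records the defect $\xi_{h_1} + \xi_{h_2} - \xi_{h_3} - \xi_{h_4} \bmod 1$ for $\xi_h = \alpha h \bmod 1$.

```python
from fractions import Fraction
from itertools import product


def additive_quadruples(H):
    """Yield all (h1, h2, h3, h4) in H^4 with h1 + h2 = h3 + h4."""
    members = set(H)
    for h1, h2, h3 in product(H, repeat=3):
        h4 = h1 + h2 - h3
        if h4 in members:
            yield h1, h2, h3, h4


alpha = Fraction(17, 101)  # hypothetical frequency


def xi(h):
    """Linear frequency xi_h = alpha * h mod 1."""
    return (alpha * h) % 1


H = range(1, 21)
defects = {(xi(h1) + xi(h2) - xi(h3) - xi(h4)) % 1
           for h1, h2, h3, h4 in additive_quadruples(H)}
```

Every quadruple gives defect zero, so the four frequencies $\xi_{h_1}, \ldots, \xi_{h_4}$ satisfy the non-trivial relation with coefficients $(1, 1, -1, -1)$ and can never be independent.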

By using tools from additive combinatorics, we can show that bracket-linear frequencies are the only obstructions to independence on additive quadruples. More precisely, we have

Lemma 10.5. Let $l \in \mathbb{N}$, let $\varepsilon > 0$ be a limit real, let $H$ be a dense limit subset of $[[N]]$, and for each $h \in H$, let $\xi_{h,1}, \ldots, \xi_{h,l}$ be frequencies in ${}^*\mathbb{T}$ that depend in a limit fashion on $h$. Then there exists a dense subset $H'$ of $H$, standard natural numbers $l_*, l', l'', l''', l'''' \in \mathbb{N}$, "core" frequencies $\xi_{*,1}, \ldots, \xi_{*,l_*}, \xi''_{*,1}, \ldots, \xi''_{*,l''} \in {}^*\mathbb{T}$, and "petal" frequencies $\xi'_{h,1}, \ldots, \xi'_{h,l'}, \xi'''_{h,1}, \ldots, \xi'''_{h,l'''}, \xi''''_{h,1}, \ldots, \xi''''_{h,l''''} \in {}^*\mathbb{T}$ for each $h \in H'$ depending in a limit fashion on $h$, obeying the following properties:

(i) (Independence) For almost all additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H'$ (i.e. for all but $o(|H'|^3)$ such quadruples), the frequencies $\xi_{*,j}$ for $1 \leq j \leq l_*$, $\xi'_{h_i,j}$ for $i = 1, 2, 3, 4$ and $1 \leq j \leq l'$, and $\xi''''_{h_i,j}$ for $i = 1, 2, 3$ and $1 \leq j \leq l''''$ are jointly linearly independent modulo $O(\varepsilon)$.

(ii) (Rationality) For each $1 \leq j \leq l''$, $\xi''_{*,j}$ is a standard rational.

(iii) (Smallness) For each $h \in H'$ and $1 \leq j \leq l'''$, $\xi'''_{h,j} = O(\varepsilon)$.

(iv) (Bracket-linearity) For each $1 \leq j \leq l''''$, there exist $\alpha_j \in {}^*\mathbb{T}$ and $\beta_j \in {}^*\mathbb{R}$ such that $\xi''''_{h,j} = \{\alpha_j h\}\beta_j \bmod 1$ for all $h \in H'$. Furthermore, the map $h \mapsto \xi''''_{h,j}$ is a Freiman homomorphism on $H'$ (see §2 for the definition of a Freiman homomorphism).

(v) (Representation) For each $h \in H'$, the $\xi_{h,1}, \ldots, \xi_{h,l}$ are linear combinations over $\mathbb{Z}$ of the frequencies

\[ \xi_{*,1}, \ldots, \xi_{*,l_*},\ \xi'_{h,1}, \ldots, \xi'_{h,l'},\ \xi''_{*,1}, \ldots, \xi''_{*,l''},\ \xi'''_{h,1}, \ldots, \xi'''_{h,l'''},\ \xi''''_{h,1}, \ldots, \xi''''_{h,l''''}. \]
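The Freiman homomorphism condition in (iv) can be illustrated in a simple finitary setting. In the sketch below (our own toy example, not the construction used in the proof) we take $\{x\}$ to be the fractional part in $[0,1)$, which may differ from the paper's convention, and restrict $h$ to the set where $\{\alpha h\} < 1/4$. On that set, for an additive quadruple the quantity $\{\alpha h_1\} + \{\alpha h_2\} - \{\alpha h_3\} - \{\alpha h_4\}$ is an integer of absolute value less than $1/2$, hence zero, so $h \mapsto \{\alpha h\}\beta$ respects additive quadruples exactly.

```python
from fractions import Fraction
from itertools import product

alpha, beta = Fraction(17, 101), Fraction(5, 3)  # hypothetical parameters


def frac(x):
    """Fractional part in [0, 1); an assumed convention for {x}."""
    return x % 1


# Restrict to a Bohr-type set on which h -> {alpha h} cannot wrap around.
H = [h for h in range(1, 101) if frac(alpha * h) < Fraction(1, 4)]
xi = {h: frac(alpha * h) * beta for h in H}  # bracket-linear frequency {alpha h} beta

members = set(H)
quads = [(h1, h2, h3, h1 + h2 - h3)
         for h1, h2, h3 in product(H, repeat=3) if h1 + h2 - h3 in members]
freiman = all(xi[a] + xi[b] == xi[c] + xi[d] for a, b, c, d in quads)
```

Passing to such a subset is one concrete way to read the phrase "refining $H'$ if necessary" in this simplified model.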

Proof. As usual, we define a partial solution to be a collection of objects $H', l_*, l', l'', l''', l'''', \xi_{*,j}, \xi'_{h,j}, \xi''_{*,j}, \xi'''_{h,j}, \xi''''_{h,j}$ obeying all of the required properties except possibly for the independence property. Again, there is clearly at least one partial solution, so we select a partial solution with a minimal value of $l'$, and then (for fixed $l'$) a minimal value of $l''''$, and then (for fixed $l', l''''$) a minimal value of $l_*$. We claim that this partial solution obeys the independence property, which will give the lemma.

Suppose for contradiction that this were not the case; then by Lemma A.9 there exist standard integers $a_{*,j}$ for $1 \leq j \leq l_*$, $a'_{i,j}$ for $1 \leq i \leq 4$ and $1 \leq j \leq l'$, and $a''''_{i,j}$ for $1 \leq i \leq 3$ and $1 \leq j \leq l''''$, not all zero, such that

\[ \sum_{j=1}^{l_*} a_{*,j} \xi_{*,j} + \sum_{i=1}^{4} \sum_{j=1}^{l'} a'_{i,j} \xi'_{h_i,j} + \sum_{i=1}^{3} \sum_{j=1}^{l''''} a''''_{i,j} \xi''''_{h_i,j} = O(\varepsilon) \bmod 1 \]

for many additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H'$.

Suppose first that all the $a'_{i,j}$ and $a''''_{i,j}$ vanished. Then we have a relation

\[ \sum_{j=1}^{l_*} a_{*,j} \xi_{*,j} = O(\varepsilon) \bmod 1 \]

that only involves core frequencies; arguing as in Lemma 10.4 we can thus find another partial solution with a smaller value of $l_*$ (and the same value of $l', l''''$), contradicting minimality.

Next, suppose that the $a'_{i,j}$ all vanished, but the $a''''_{i,j}$ did not all vanish. Then we have a relation

\[ \sum_{j=1}^{l_*} a_{*,j} \xi_{*,j} + \sum_{i=1}^{3} \sum_{j=1}^{l''''} a''''_{i,j} \xi''''_{h_i,j} = O(\varepsilon) \bmod 1 \tag{10.1} \]

for many triples $h_1, h_2, h_3$ in $H'$.

Without loss of generality let us suppose that $a''''_{1,1}$ is non-zero. By the pigeonhole principle, we may find $h_2, h_3 \in H'$ such that (10.1) holds for all $h_1$ in a dense subset $H''$ of $H'$. As in previous arguments, we then find $\tilde{\xi}_{*,j} \in {}^*\mathbb{T}$ such that $a''''_{1,1} \tilde{\xi}_{*,j} = \xi_{*,j}$ for each $1 \leq j \leq l_*$, and also find $\tilde{\beta}_j \in {}^*\mathbb{R}$ such that $a''''_{1,1} \tilde{\beta}_j = \beta_j$ for all $1 \leq j \leq l''''$. If we then set $\tilde{\xi}''''_{h,j} := \{\alpha_j h\}\tilde{\beta}_j$ for each $h \in H'$ and $1 \leq j \leq l''''$, then $a''''_{1,1} \tilde{\xi}''''_{h,j} = \xi''''_{h,j}$, and so for any $h_1 \in H''$ we have

\[ \xi''''_{h_1,1} = -\sum_{j=1}^{l_*} a_{*,j} \tilde{\xi}_{*,j} - \sum_{j=2}^{l''''} a''''_{1,j} \tilde{\xi}''''_{h_1,j} - \sum_{i=2}^{3} \sum_{j=1}^{l''''} a''''_{i,j} \tilde{\xi}''''_{h_i,j} + q_{h_1} + s_{h_1} \bmod 1 \]

for some standard rational $q_{h_1}$ and some $s_{h_1} = O(\varepsilon)$, both depending in a limit fashion on $h_1$. By refining $H''$ if necessary (and using the bracket-linear nature of the $\tilde{\xi}''''_{h,j}$) we may assume that the map $h \mapsto \tilde{\xi}''''_{h,j}$ is a Freiman homomorphism on $H''$, and by Lemma A.9 we may make $q_{h_1} = q_*$ independent of $h_1$. If we then argue as in the proof of Lemma 10.4 we may find a new partial solution with a smaller value of $l''''$ and the same value of $l'$, contradicting minimality.

Finally, suppose that the $a'_{i,j}$ did not all vanish. Using the Freiman homomorphism property to permute the $i$ indices if necessary, we may assume that $a'_{4,1}$ does not vanish. We then have

\[ \Xi_1(h_1) + \Xi_2(h_2) + \Xi_3(h_3) + \Xi_4(h_4) = O(\varepsilon) \]

for many additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H'$, where the limit functions $\Xi_i : H \to {}^*\mathbb{T}$ are defined by

\[ \Xi_i(h) := \sum_{j=1}^{l'} a'_{i,j} \xi'_{h,j} + \sum_{j=1}^{l''''} a''''_{i,j} \xi''''_{h,j} \bmod 1 \]

for $i = 1, 2, 3$ and $h \in H$, and

\[ \Xi_4(h) := \sum_{j=1}^{l_*} a_{*,j} \xi_{*,j} + \sum_{j=1}^{l'} a'_{4,j} \xi'_{h,j} \bmod 1. \]

We can use this additive structure to "solve" for $\Xi_4$, using a result from additive combinatorics which we present here as Lemma F.1. Applying this lemma, we can then find a dense limit subset $H'$ of $H$, a standard integer $K$, and frequencies $\alpha'_1, \ldots, \alpha'_K, \delta \in {}^*\mathbb{T}$ and $\beta'_1, \ldots, \beta'_K \in {}^*\mathbb{R}$ such that

\[ \Xi_4(h) = \sum_{k=1}^{K} \{\alpha'_k h\}\beta'_k + \delta + O(\varepsilon) \bmod 1 \]

and thus, subtracting the remaining terms in the definition of $\Xi_4$,

\[ a'_{4,1} \xi'_{h,1} = \sum_{k=1}^{K} \{\alpha'_k h\}\beta'_k + \delta - \sum_{j=1}^{l_*} a_{*,j} \xi_{*,j} - \sum_{j=2}^{l'} a'_{4,j} \xi'_{h,j} + O(\varepsilon) \bmod 1 \]

for all $h \in H'$.

As usual, we now find j3 k G *K for 1 < k < if, € *M for 1 ^ j s$ 8 6 T and 
for 1 ^ j < Z* such that a' 4 i$k = Pk, a 4i$j = Pj> a 4 ,i<5 = S, and a 4 = 
We then set £' h • := {ajh}f3j mod 1, and we conclude that 

K U I' 

Ch.i = ^ZWk h )K + s - "■■z-j + H a ^J'hj + qh + Sh mod 1 

fe=l j=l j=2 

for all $h \in H'$, where $q_h \in \mathbb{Q}$ and $s_h = O(\varepsilon)$ depend in a limit fashion on $h$. By refining $H'$ we may take $q_h = q_*$ independent of $h$.

We can then use this relation to build a new partial solution that decreases $l'$ by 1, at the expense of enlarging the other dimensions $l_*, l'', l''', l''''$ (and also refining $H$ to $H'$), again contradicting minimality, and the claim follows. □

We now apply the above lemma to the language of horizontal frequency vectors introduced in the previous section. We need some definitions:

Definition 10.6 (Properties of horizontal frequency vectors). Let

\[ \mathcal{F} = (\xi_{i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D_i} \quad \text{and} \quad \mathcal{F}' = (\xi'_{i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D'_i} \]

be horizontal frequency vectors.

• We say that $\mathcal{F}$ is independent if, for each $1 \leq i \leq s-1$, the tuple $(\xi_{i,j})_{1 \leq j \leq D_i}$ is independent modulo $O(N^{-i})$.

• We say that $\mathcal{F}$ is rational if all the $\xi_{i,j}$ are standard rationals.

• We say that $\mathcal{F}$ is small if one has $\xi_{i,j} = O(N^{-i})$ for all $1 \leq i \leq s-1$ and $1 \leq j \leq D_i$.

• We define the disjoint union $\mathcal{F} \uplus \mathcal{F}' = (\xi''_{i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D_i + D'_i}$ by declaring $\xi''_{i,j}$ to equal $\xi_{i,j}$ if $j \leq D_i$ and $\xi'_{i,j-D_i}$ if $D_i < j \leq D_i + D'_i$. This is clearly a horizontal frequency vector with dimensions $(D_1 + D'_1, \ldots, D_{s-1} + D'_{s-1})$.

• We say that $\mathcal{F}$ is represented by $\mathcal{F}'$ if for every $1 \leq i \leq s-1$ and $1 \leq j \leq D_i$, $\xi_{i,j}$ is a standard integer linear combination of the $\xi'_{i,j'}$ for $1 \leq j' \leq D'_i$.
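As a bookkeeping aid for the disjoint union operation, the following sketch models a horizontal frequency vector as a list of $s - 1$ lists, the $i$-th list holding the degree $i$ frequencies. This data layout and the names `dimension_vector` and `disjoint_union` are assumptions made for illustration only.

```python
from fractions import Fraction


def dimension_vector(F):
    """Dimension vector (D_1, ..., D_{s-1}) of a frequency vector stored as a list of lists."""
    return tuple(len(row) for row in F)


def disjoint_union(F, F_prime):
    """Degree-by-degree concatenation: entry j is xi_{i,j} for j <= D_i, then xi'_{i,j-D_i}."""
    if len(F) != len(F_prime):
        raise ValueError("both vectors must have s - 1 rows")
    return [list(row) + list(row_prime) for row, row_prime in zip(F, F_prime)]


# Hypothetical vectors with s - 1 = 2 and dimension vectors (1, 2) and (2, 0).
F = [[Fraction(1, 3)], [Fraction(1, 5), Fraction(2, 5)]]
F_prime = [[Fraction(1, 7), Fraction(2, 7)], []]
union = disjoint_union(F, F_prime)
```

The dimension vector of the union is the sum of the two dimension vectors, here $(3, 2)$, in line with the definition.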

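Lemma 10.7 (Sunflower lemma). Let $H$ be a dense subset of $[[N]]$, and let $(\mathcal{F}_h)_{h \in H}$ be a family of horizontal frequency vectors depending in a limit fashion on $h$, whose dimension vector $D = D_h$ is independent of $h$. Then we can find the following objects:

• A dense subset $H'$ of $H$;

• Dimension vectors $D_* = D_{*,\mathrm{ind}} + D_{*,\mathrm{rat}}$ and $D' = D'_{\mathrm{lin}} + D'_{\mathrm{ind}} + D'_{\mathrm{sml}}$, which we write as $D_* = (D_{*,i})_{i=1}^{s-1}$, $D_{*,\mathrm{ind}} = (D_{*,\mathrm{ind},i})_{i=1}^{s-1}$, etc.;

• A core horizontal frequency vector $\mathcal{F}_* = (\xi_{*,i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D_{*,i}}$, which is partitioned as $\mathcal{F}_* = \mathcal{F}_{*,\mathrm{ind}} \uplus \mathcal{F}_{*,\mathrm{rat}}$, with the indicated dimension vectors $D_{*,\mathrm{ind}}, D_{*,\mathrm{rat}}$;

• A petal horizontal frequency vector $\mathcal{F}'_h = (\xi'_{h,i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D'_i}$, which is partitioned as $\mathcal{F}'_h = \mathcal{F}'_{h,\mathrm{lin}} \uplus \mathcal{F}'_{h,\mathrm{ind}} \uplus \mathcal{F}'_{h,\mathrm{sml}}$, which is a limit function of $h$ and with the indicated dimension vectors $D'_{\mathrm{lin}}, D'_{\mathrm{ind}}, D'_{\mathrm{sml}}$,

which obey the following properties:

• For all $h \in H'$, $\mathcal{F}'_{h,\mathrm{sml}}$ is small.

• $\mathcal{F}_{*,\mathrm{rat}}$ is rational.

• For every $1 \leq i \leq s-1$ and $1 \leq j \leq D'_{\mathrm{lin},i}$, there exists $\alpha_{i,j} \in {}^*\mathbb{T}$ and $\beta_{i,j} \in {}^*\mathbb{R}$ such that

\[ \xi'_{h,i,j} = \{\alpha_{i,j} h\}\beta_{i,j} \bmod 1 \tag{10.2} \]

holds for all $h \in H'$, and furthermore that the map $h \mapsto \xi'_{h,i,j}$ is a Freiman homomorphism on $H'$.

• For all $h \in H'$, $\mathcal{F}_h$ is represented by $\mathcal{F}_* \uplus \mathcal{F}'_h$.

• (Independence property) For almost all additive quadruples $(h_1, h_2, h_3, h_4)$ in $H'$,

\[ \mathcal{F}_{*,\mathrm{ind}} \uplus \biguplus_{i=1}^{4} \mathcal{F}'_{h_i,\mathrm{ind}} \uplus \biguplus_{i=1}^{3} \mathcal{F}'_{h_i,\mathrm{lin}} \]

is independent.

Proof. Write $\mathcal{F}_h = (\xi_{h,i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D_i}$. For each $1 \leq i \leq s-1$ in turn, apply Lemma 10.5 to the collections $(\xi_{h,i,1}, \ldots, \xi_{h,i,D_i})_{h \in H}$ and $\varepsilon = O(N^{-i})$, refining $H$ once for each $i$. The claim then follows by relabeling. □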

To apply this lemma to families of nilcharacters, we will need two additional 
lemmas. 

Lemma 10.8 (Change of basis). Suppose that $\chi \in \Xi^{(s-1,r_*)}_{\mathrm{DR}}([N])$ is a
degree-rank $(s-1, r_*)$ nilcharacter with a total frequency representation
$(D, \mathcal{F}, \eta)$, and suppose that $\mathcal{F}$ is represented by another horizontal
frequency vector $\mathcal{F}'$ with a dimension vector $D'$. Then there exists a vertical
frequency $\eta' : G^{D'}_{(s-1,r_*)} \to \mathbb{R}$ such that $\chi$ has a total frequency
representation $(D', \mathcal{F}', \eta')$.

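Proof. By hypothesis, each element $\xi_{i,j}$ of $\mathcal{F}$ can be expressed as a
standard linear combination $\xi_{i,j} = \sum_{j'=1}^{D'_i} c_{i,j,j'} \xi'_{i,j'}$ of
elements $\xi'_{i,j'}$ of $\mathcal{F}'$ of the same degree, where the coefficients
$c_{i,j,j'}$ are standard integers.
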
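Now let $\psi : G^{D'} \to G^{D}$ be the unique filtered homomorphism that maps
$e'_{i,j'}$ to $\prod_{j=1}^{D_i} e_{i,j}^{c_{i,j,j'}}$ (this can be viewed as an "adjoint"
of the representation of $\mathcal{F}$ by $\mathcal{F}'$). By hypothesis, $\chi$ has a
representation $\chi(n) = F(\mathcal{O}(n), \mathcal{O}_0(n))$ with

$$\mathrm{Taylor}_i(\mathcal{O}) = \pi_{\mathrm{Horiz}_i(G/\Gamma)}\Big(\phi\Big(\prod_{j=1}^{D_i} e_{i,j}^{\xi_{i,j}}\Big)\Big)$$

for some filtered homomorphism $\phi : G^{D} \to G$. A brief calculation shows that the
right-hand side can also be expressed as

$$\pi_{\mathrm{Horiz}_i(G/\Gamma)}\Big(\phi \circ \psi\Big(\prod_{j'=1}^{D'_i} (e'_{i,j'})^{\xi'_{i,j'}}\Big)\Big).$$

As $\phi \circ \psi : G^{D'} \to G$ is a filtered homomorphism, and
$\eta \circ \psi : G^{D'}_{(s-1,r_*)} \to \mathbb{R}$ is a vertical frequency, we obtain the
claim. □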

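Lemma 10.9. Let $\mathcal{F}$ be a horizontal frequency vector of dimension $D$ of the form

$$\mathcal{F} = \mathcal{F}_{\mathrm{rat}} \uplus \mathcal{F}_{\mathrm{sml}} \uplus \mathcal{F}'$$

where $\mathcal{F}_{\mathrm{rat}}$ is rational and $\mathcal{F}_{\mathrm{sml}}$ is small, and
$\mathcal{F}'$ has dimension $D'$. Suppose that $\chi \in \Xi^{(s-1,r_*)}_{\mathrm{DR}}([N])$ is a
nilcharacter with a total frequency representation $(D, \mathcal{F}, \eta)$. Then there
exists a vertical frequency $\eta' : G^{D'}_{(s-1,r_*)} \to \mathbb{R}$ such that $\chi$ has
total frequency $(D', \mathcal{F}'/M, \eta')$ for some standard integer $M \geq 1$.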

Remark. This lemma crucially relies on the hypothesis $s \geq 3$, as it makes the
(degree 1) contributions of rational and small frequencies to be of lower order.
Because of this, the inverse conjecture for $s > 2$ is in a very slight way a little bit
simpler than the $s \leq 2$ theory, though it is of course more complicated in many
other ways.

Proof. By induction we may assume that $\mathcal{F}$ is formed from $\mathcal{F}'$ by adding a
single frequency $\xi_{i_0,D_{i_0}}$, which is either rational or small.

Let us first suppose that we are adding a single frequency which is not just
rational, but is in fact an integer. Then if
$\chi(n) = F(g(n){}^*\Gamma, g_0(n){}^*\Gamma_0)$ is a nilcharacter with a total frequency
representation $(D, \mathcal{F}, \eta)$, then we have a filtered homomorphism
$\phi : G^{D}/\Gamma^{D} \to G/\Gamma$ such that

$$g_i = \prod_{j=1}^{D_i} \phi(e_{i,j})^{\xi_{i,j}} \mod G_{(i,2)}$$

for all $1 \leq i \leq s-1$, where $g_i$ are the Taylor coefficients of $g$. Specialising
to the degree $i_0$ and using the integer nature of $\xi_{i_0,D_{i_0}}$, we have

$$g_{i_0} = g'_{i_0} \gamma_{i_0}$$

where $\gamma_{i_0}$ is an element of $\Gamma_{i_0}$, and

$$g'_{i_0} = \prod_{j=1}^{D_{i_0}-1} \phi(e_{i_0,j})^{\xi_{i_0,j}} \mod G_{(i_0,2)}.$$

From this and the Baker-Campbell-Hausdorff formula (3.2), we can write
$g(n) = g'(n) \gamma_{i_0}^{\binom{n}{i_0}}$, where $g'$ is a polynomial sequence with a
horizontal frequency representation $(D', \phi', \mathcal{F}')$, where $D'$ is $D$ with
$D_{i_0}$ decremented by one, and $\phi'$ is the restriction of $\phi$ to the
subnilmanifold $G^{D'}/\Gamma^{D'}$. Since $g(n){}^*\Gamma = g'(n){}^*\Gamma$, we see that
$\chi$ has a total frequency representation $(D', \mathcal{F}', \eta')$, where $\eta'$ is the
restriction of $\eta : G^{D}_{(s-1,r_*)} \to \mathbb{R}$ to $G^{D'}_{(s-1,r_*)}$. This gives the
claim in this case (with $M = 1$).

Now suppose that $\xi_{i_0,D_{i_0}}$ is merely rational rather than integer. Then we can
argue as before, except that now $\gamma_{i_0}$ is a rational element of $G_{i_0}$, so that
$\gamma_{i_0}^m \in \Gamma_{i_0}$ for some standard positive integer $m$. As such, there
exists a standard positive integer $q$ such that
$\gamma_{i_0}^{\binom{n}{i_0}} \mod {}^*\Gamma$ is periodic with period $q$. As a
consequence, there exists a bounded index subgroup $\Gamma'$ of $\Gamma$ such that the point

$$g'(n) \gamma_{i_0}^{\binom{n}{i_0}} \mod {}^*\Gamma$$

in $G/\Gamma$ can be expressed as a Lipschitz function of

$$g'(n) \mod \Gamma'$$

and of the quantity $n/q \mod 1$. Repeating the previous arguments, we thus obtain
a total frequency representation $(D', \tilde{\mathcal{F}}', \eta')$ for some $\eta'$, and some
$\tilde{\mathcal{F}}'$ whose coefficients are rational combinations of those of
$\mathcal{F}'$; note that the $n/q$ dependence can be easily absorbed into the lower order
term $G_0/\Gamma_0$ since $s \geq 3$. The claim then follows from Lemma 10.8.

Finally, suppose that $\xi_{i_0,D_{i_0}}$ is small rather than rational. Then we can write

$$g_{i_0} = c_{i_0} g'_{i_0}$$

where $g'_{i_0}$ is as before, and $c_{i_0} \in G_{i_0}$ is at a distance $O(N^{-i_0})$ from
the origin. We can thus write

$$g(n) = c_{i_0}^{\binom{n}{i_0}} g'(n)$$

where $g'$ is a polynomial sequence with horizontal frequency representation
$(D', \phi', \mathcal{F}')$, as before. On $[N]$, the sequence $n \mapsto c_{i_0}^{\binom{n}{i_0}}$
can be expressed as a bounded Lipschitz function of $n/2N \mod 1$. As a consequence,
we can write $\chi$ in the form

$$\chi(n) = F'(g'(n){}^*\Gamma, g_0(n){}^*\Gamma_0, n/2N \mod 1)$$

for some $F' \in \mathrm{Lip}({}^*(G/\Gamma \times G_0/\Gamma_0 \times \mathbb{T}))$. As $s \geq 3$,
the final term $\mathbb{T}$ can be absorbed into the degree-rank $\leq (s-1, r_*-1)$
nilmanifold $G_0/\Gamma_0$, and the claim follows (with $M = 1$). □

Finally, we can state the main result of this section. 

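Lemma 10.10 (Sunflower lemma). Let $H$ be a dense subset of $[[N]]$, and let
$(\chi_h)_{h \in H}$ be a family of nilcharacters $\chi_h \in \Xi^{(s-1,r_*)}_{\mathrm{DR}}([N])$
depending in a limit fashion on $h$. Then we can find

(i) A dense subset $H'$ of $H$;

(ii) Dimension vectors $D_*$ and $D' = D'_{\mathrm{lin}} + D'_{\mathrm{ind}}$, which we write as
$D_* = (D_{*,i})_{i=1}^{s-1}$, $D' = (D'_i)_{i=1}^{s-1}$,
$D'_{\mathrm{lin}} = (D'_{\mathrm{lin},i})_{i=1}^{s-1}$, $D'_{\mathrm{ind}} = (D'_{\mathrm{ind},i})_{i=1}^{s-1}$;

(iii) A core horizontal frequency vector
$\mathcal{F}_* = (\xi_{*,i,j})_{1 \leq i \leq d;\, 1 \leq j \leq D_{*,i}}$;

(iv) A petal horizontal frequency vector
$\mathcal{F}'_h = (\xi'_{h,i,j})_{1 \leq i \leq d;\, 1 \leq j \leq D'_i}$, which is partitioned as
$\mathcal{F}'_h = \mathcal{F}'_{h,\mathrm{lin}} \uplus \mathcal{F}'_{h,\mathrm{ind}}$, which is a limit
function of $h$, where $\mathcal{F}'_{h,\mathrm{lin}}, \mathcal{F}'_{h,\mathrm{ind}}$ have dimensions
$D'_{\mathrm{lin}}, D'_{\mathrm{ind}}$ respectively;

(v) A vertical frequency $\eta : G^{D_*+D'}_{(s-1,r_*)} \to \mathbb{R}$ with dimension vector
$D_* + D'$

which obey the following properties:

(i) ($\mathcal{F}'_{h,\mathrm{lin}}$ is bracket-linear) For every $1 \leq i \leq d$ and
$1 \leq j \leq D'_{i,\mathrm{lin}}$, there exist $\alpha_{i,j} \in {}^*\mathbb{T}$ and
$\beta_{i,j} \in {}^*\mathbb{R}$ such that

$$\xi'_{h,i,j} = \{\alpha_{i,j} h\} \beta_{i,j} \mod 1 \tag{10.2}$$

for all $h \in H'$, and furthermore the map $h \mapsto \xi'_{h,i,j}$ is a Freiman
homomorphism on $H'$.

(ii) (Independence) For almost all additive quadruples $(h_1, h_2, h_3, h_4)$ in $H$,

$$\mathcal{F}_{*,\mathrm{ind}} \uplus \biguplus_{i=1}^{4} \mathcal{F}'_{h_i,\mathrm{ind}} \uplus \biguplus_{i=1}^{3} \mathcal{F}'_{h_i,\mathrm{lin}}$$

is independent.

(iii) (Representation) For all $h \in H'$, $\chi_h$ has a total frequency representation
$(D_* + D', \mathcal{F}_* \uplus \mathcal{F}'_h, \eta)$.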
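
As a purely finite illustration of the Freiman homomorphism condition in property (i), the following Python sketch checks whether a map preserves sums along additive quadruples $h_1 + h_2 = h_3 + h_4$. The function names `additive_quadruples` and `is_freiman_hom`, the reduction modulo 1, and the two sample maps are assumptions made for this illustration; the lemma itself concerns limit objects, which the sketch does not model.

```python
from fractions import Fraction
from itertools import product

def additive_quadruples(H):
    """All (h1, h2, h3, h4) in H^4 with h1 + h2 == h3 + h4."""
    H = list(H)
    return [q for q in product(H, repeat=4) if q[0] + q[1] == q[2] + q[3]]

def is_freiman_hom(xi, H):
    """True if xi(h1) + xi(h2) == xi(h3) + xi(h4) (mod 1) on every
    additive quadruple of H (values are compared modulo 1 by assumption)."""
    for h1, h2, h3, h4 in additive_quadruples(H):
        if (xi(h1) + xi(h2) - xi(h3) - xi(h4)) % 1 != 0:
            return False
    return True

H = range(1, 9)                 # small stand-in for a dense subset of [[N]]
alpha = Fraction(3, 7)          # hypothetical sample frequency

def linear(h):
    """h -> alpha*h mod 1: sums along additive quadruples are preserved."""
    return (alpha * h) % 1

def quadratic(h):
    """h -> alpha*h^2 mod 1: fails, e.g. on the quadruple (1, 4, 2, 3)."""
    return (alpha * h * h) % 1

print(is_freiman_hom(linear, H), is_freiman_hom(quadratic, H))
```

Exact rational arithmetic is used so that the comparison modulo 1 is not affected by floating-point rounding.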

Proof. Each $\chi_h$ has a total frequency representation $(D_h, \mathcal{F}_h, \eta_h)$. The
space of representations is a $\sigma$-limit set, so by Lemma A.11 we may assume that
$(D_h, \mathcal{F}_h, \eta_h)$ depends in a limit fashion on $h$.

The number of possible dimension vectors is countable. Applying Lemma A.9
and passing from $H$ to a dense subset, we may assume that $D = D_h$ is independent
of $h$.

We then apply Lemma 10.7 to the $(\mathcal{F}_h)_{h \in H}$, obtaining a dense subset $H'$ of
$H$, dimension vectors $D_* = D_{*,\mathrm{ind}} + D_{*,\mathrm{rat}}$ and
$D' = D'_{\mathrm{lin}} + D'_{\mathrm{ind}} + D'_{\mathrm{sml}}$, a core horizontal frequency vector
$\mathcal{F}_* = \mathcal{F}_{*,\mathrm{ind}} \uplus \mathcal{F}_{*,\mathrm{rat}}$, and petal horizontal
frequency vectors
$\mathcal{F}'_h = \mathcal{F}'_{h,\mathrm{lin}} \uplus \mathcal{F}'_{h,\mathrm{ind}} \uplus \mathcal{F}'_{h,\mathrm{sml}}$
for each $h \in H'$ with the stated properties.

Applying Lemma 10.8, we see that for each $h \in H'$, $\chi_h$ has a total frequency
representation

$$(D_* + D', \mathcal{F}_* \uplus \mathcal{F}'_h, \eta'_h)$$

for some vertical frequency $\eta'_h$. Applying Lemma 10.9, we conclude that $\chi_h$ has a
total frequency representation

$$(D_{*,\mathrm{ind}} + D'_{\mathrm{lin}} + D'_{\mathrm{ind}},\ \mathcal{F}_{*,\mathrm{ind}} \uplus \mathcal{F}'_{h,\mathrm{lin}} \uplus \mathcal{F}'_{h,\mathrm{ind}},\ \eta''_h)$$

for some vertical frequency $\eta''_h$. The number of vertical frequencies $\eta''_h$ is
countable, so by Lemma A.9 we may assume that $\eta = \eta''_h$ is also independent of $h$.
The claim then follows. □

11. Obtaining bracket-linear behaviour 

We return now to the task of proving Theorem 7.2. To recall the situation thus
far, we are given a two-dimensional nilcharacter
$\chi \in \Xi^{(1,s-1)}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$ and a family of degree-rank
$(s-1, r_*)$ nilcharacters $(\chi_h)_{h \in H}$ depending in a limit fashion on
a parameter $h$ in a dense subset $H$ of $[[N]]$, with the property that there is a
function $f \in L^\infty[N]$ such that $\chi(h, \cdot) \otimes \chi_h$ $(s-2)$-correlates
with $f$ for all $h \in H$. Using Proposition 7.3 to eliminate $f$ and $\chi$, and refining
$H$ to a dense subset if necessary, we conclude that the nilcharacter (7.2) is
$(s-2)$-biased for many additive quadruples $h_1 + h_2 = h_3 + h_4$ in $H$. We make the
simple but important remark that this conclusion is "hereditary" in the sense that it
continues to hold if we replace $H$ with an arbitrary dense subset $H'$ of $H$, since the
hypothesis of Proposition 7.3 clearly restricts from $H$ to $H'$ in this fashion.

Next, we apply Lemma 10.10 to obtain a dense refinement $H'$ of $H$ for which
the $\chi_h$ have a frequency representation involving various types of frequencies: a
core set of frequencies $\mathcal{F}_*$, a bracket-linear family
$(\mathcal{F}'_{h,\mathrm{lin}})_{h \in H'}$ of petal frequencies and an independent family
$(\mathcal{F}'_{h,\mathrm{ind}})_{h \in H'}$ of petal frequencies.

The main result of this section uses the bias of (7.2), combined with the quantitative
equidistribution theory on nilmanifolds (as reviewed in Appendix D), to obtain an
important milestone towards establishing Theorem 7.2, namely that the independent
petal frequencies $\mathcal{F}'_{h,\mathrm{ind}}$ do not actually have any influence on the
top-order behaviour of the nilcharacters $\chi_h$, and that the bracket-linear frequencies
only influence this top-order behaviour in a linear fashion. For this, we use an
argument of Furstenberg and Weiss [14] that was also used in the predecessor [28] to
this paper. See [29] for another, somewhat simplified, exposition of this argument.

We begin by formally stating the result we will prove in this section.

Theorem 11.1 (No petal-petal or regular terms). Let $f, H, \chi, (\chi_h)_{h \in H}$ be as in
Theorem 7.2, and let
$H', D_*, D', D'_{\mathrm{lin}}, D'_{\mathrm{ind}}, \mathcal{F}_*, \mathcal{F}'_h, \mathcal{F}'_{h,\mathrm{lin}}, \mathcal{F}'_{h,\mathrm{ind}}, \eta$
be as in Lemma 10.10. Let $w \in G^{D_*+D'}$ be an $(r_*-1)$-fold commutator of
$e_{i_1,j_1}, \ldots, e_{i_{r_*},j_{r_*}}$, where $1 \leq i_1, \ldots, i_{r_*} \leq s-1$,
$i_1 + \ldots + i_{r_*} = s-1$, and $1 \leq j_l \leq D_{*,i_l} + D'_{i_l}$ for all $l$ with
$1 \leq l \leq r_*$.

(i) (No petal-petal terms) If $j_l > D_{*,i_l}$ for at least two values of $l$, then
$\eta(w) = 0$.

(ii) (No regular terms) If $j_l > D_{*,i_l} + D'_{\mathrm{lin},i_l}$ for at least one value of
$l$, then $\eta(w) = 0$.

The remainder of this section is devoted to the proof of Theorem 11.1.
Let the notation and assumptions be as in the above theorem. From Proposition
7.3 we know that, for many additive quadruples $(h_1, h_2, h_3, h_4)$ in $H'$, the sequence
(7.2) is $(s-2)$-biased. Also, from Lemma 10.10 we see that for almost all of these
quadruples, the horizontal frequency vectors

$$\mathcal{F}_{*,\mathrm{ind}} \uplus \biguplus_{i=1}^{4} \mathcal{F}'_{h_i,\mathrm{ind}} \uplus \biguplus_{i=a,b,c} \mathcal{F}'_{h_i,\mathrm{lin}} \tag{11.1}$$

are independent for all distinct $a, b, c \in \{1,2,3,4\}$. We may therefore find an
additive quadruple $(h_1, h_2, h_3, h_4)$ for which (7.2) is $(s-2)$-biased, and for which
(11.1) is independent for all choices of distinct $a, b, c \in \{1, 2, 3, 4\}$.

Fix $(h_1, h_2, h_3, h_4)$ with these properties. We convert the above information to
a non-equidistribution result concerning a polynomial orbit.

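For each $i = 1,2,3,4$, we see from Lemma 10.10 that $\chi_{h_i}$ has a total frequency
representation

$$(D_* + D', \mathcal{F}_* \uplus \mathcal{F}'_{h_i}, \eta).$$

We write

$$\mathcal{F}_* \uplus \mathcal{F}'_{h_i} = (\xi_{h_i,j,k})_{1 \leq j \leq s-1;\, 1 \leq k \leq D_j},$$

where

$$D_j := D_{*,j} + D'_j;$$

thus the frequencies associated to
$\mathcal{F}_*, \mathcal{F}'_{h_i,\mathrm{ind}}, \mathcal{F}'_{h_i,\mathrm{lin}}$ correspond to the ranges
$1 \leq k \leq D_{*,j}$, $D_{*,j} < k \leq D_{*,j} + D'_{\mathrm{ind},j}$, and
$D_{*,j} + D'_{\mathrm{ind},j} < k \leq D_j$ respectively.
As (7.2) is $(s-2)$-biased, we conclude that

$$|\mathbb{E}_{n \in [N]} \chi_{h_1}(n) \otimes \chi_{h_2}(n + h_1 - h_4) \otimes \overline{\chi_{h_3}(n)} \otimes \overline{\chi_{h_4}(n + h_1 - h_4)}\, \psi_{h_1,h_2,h_3,h_4}(n)| \gg 1 \tag{11.2}$$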

for some degree $\leq (s-2)$ nilsequence $\psi_{h_1,h_2,h_3,h_4}$, where $\chi_h$ is defined to
be zero outside of $[N]$. As any cutoff to an interval can be approximated to arbitrary
standard accuracy by a degree 1 nilsequence, and $s \geq 3$, we see that the same claim
holds if $\chi_h$ is instead extended to be a nilsequence on all of ${}^*\mathbb{Z}$.

From Definition 6.1 and the total frequency representation of the $\chi_{h_i}$, we can
rewrite the sequence inside the expectation of (11.2) as a degree-rank $\leq (s-1, r_*)$
nilsequence $n \mapsto F(\mathcal{O}(n))$. Here $G/\Gamma$ is the product nilmanifold

$$G/\Gamma := \prod_{i=1}^{4} G_{(i)}/\Gamma_{(i)} \times G_{(0)}/\Gamma_{(0)}$$

for some filtered nilmanifold $G_{(0)}/\Gamma_{(0)}$ of degree-rank $\leq (s-1, r_*-1)$ and
filtered nilmanifolds $G_{(i)}/\Gamma_{(i)}$ of degree-rank $\leq (s-1, r_*)$ for
$i = 1,2,3,4$. The orbit is defined by

$$\mathcal{O} = (\mathcal{O}_{(1)}, \mathcal{O}_{(2)}, \mathcal{O}_{(3)}, \mathcal{O}_{(4)}, \mathcal{O}_{(0)}) \in {}^*\mathrm{poly}(\mathbb{Z}_\mathbb{N} \to (G/\Gamma)_\mathbb{N})$$

where, for each $i, j$ with $1 \leq i \leq 4$ and $1 \leq j \leq s-1$ we have

$$\mathrm{Taylor}_j(\mathcal{O}_{(i)}) = \pi_{\mathrm{Horiz}_j(G_{(i)}/\Gamma_{(i)})}\Big(\phi_{(i)}\Big(\prod_{k=1}^{D_j} e_{j,k}^{\xi_{h_i,j,k}}\Big)\Big) \tag{11.3}$$

where $D := (D_1, \ldots, D_{s-1})$, $\phi_{(i)} : G^{D}/\Gamma^{D} \to G_{(i)}/\Gamma_{(i)}$ is a
filtered homomorphism and
$\pi_{\mathrm{Horiz}_j(G_{(i)}/\Gamma_{(i)})} : (G_{(i)})_j \to \mathrm{Horiz}_j(G_{(i)}/\Gamma_{(i)})$
is the projection to the $j$-th horizontal torus. Finally $F \in \mathrm{Lip}({}^*(G/\Gamma))$
is defined by

$$F(\phi_{(1)}(t_{(1)})x_{(1)}, \ldots, \phi_{(4)}(t_{(4)})x_{(4)}, y) = e(\eta(t_{(1)}) + \eta(t_{(2)}) - \eta(t_{(3)}) - \eta(t_{(4)}))\, F(x_{(1)}, \ldots, x_{(4)}, y) \tag{11.4}$$

for all $(x_{(1)}, \ldots, x_{(4)}, y) \in G/\Gamma$ and
$t_{(1)}, \ldots, t_{(4)} \in G^{D}_{(s-1,r_*)}$. (Note that the shifts by $h_1 - h_4$ in
(11.2) do not affect the Taylor coefficients of $\mathcal{O}_{(i)}$, thanks to the
remarks following Definition 9.6.)
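By hypothesis, we have

$$|\mathbb{E}_{n \in [N]} F(\mathcal{O}(n))| \gg 1.$$

Applying Theorem D.6, we conclude that

$$\Big| \int_{G_P/\Gamma_P} F(\varepsilon x)\, d\mu(x) \Big| \gg 1 \tag{11.5}$$

for some bounded $\varepsilon \in G$ and some rational subgroup $G_P$ of $G$ with the property
that

$$\pi_{\mathrm{Horiz}_j(G)}(G_P \cap G_j) \geq \Xi_j^{\perp} \tag{11.6}$$

for all $1 \leq j \leq s-1$, where

$$\Xi_j^{\perp} := \{x \in \mathrm{Horiz}_j(G) : \xi_j(x) = 0 \text{ for all } \xi_j \in \Xi_j\}$$

and $\Xi_j \leq \widehat{\mathrm{Horiz}_j(G/\Gamma)}$ is the group of all (standard) continuous
homomorphisms $\xi_j : \mathrm{Horiz}_j(G/\Gamma) \to \mathbb{T}$ such that

$$\xi_j(\mathrm{Taylor}_j(\mathcal{O})) = O(N^{-j}).$$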
From (11.4) and (11.5) we conclude the following lemma.

Footnote: Unfortunately, there will be several types of subscripts on nilpotent Lie
groups $G$ in this argument. Firstly one has the factor groups $G_{(i)}$. Then one also
has the degree filtration groups $G_d$ and the degree-rank filtration groups $G_{(d,r)}$
of $G$ (and also the analogous subgroups $(G_{(i)})_d$, $(G_{(i)})_{(d,r)}$ of the factor
groups $G_{(i)}$), as well as the free nilpotent groups $G^{D}$. Finally, a Ratner subgroup
$G_P$ of $G$ will also make an appearance later. We hope that these notations can be
kept separate from each other.

Lemma 11.2. The group
$G_P \cap ((G_{(1)})_{(s-1,r_*)} \times \{\mathrm{id}\} \times \{\mathrm{id}\} \times \{\mathrm{id}\} \times \{\mathrm{id}\})$
is annihilated by $\eta$.

Proof. Let $g = (g_{(1)}, \mathrm{id}, \mathrm{id}, \mathrm{id}, \mathrm{id})$ lie in the indicated
group. Then $g$ is central, and so from the invariance of Haar measure we have

$$\int_{G_P/\Gamma_P} F(\varepsilon x)\, d\mu(x) = \int_{G_P/\Gamma_P} F(g \varepsilon x)\, d\mu(x).$$

On the other hand, from (11.4) we have

$$\int_{G_P/\Gamma_P} F(g \varepsilon x)\, d\mu(x) = e(\eta(g)) \int_{G_P/\Gamma_P} F(\varepsilon x)\, d\mu(x).$$

Comparing these relationships with (11.5) we obtain the claim. □

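We now analyse the group $G_P$ further. For each $1 \leq j \leq s-1$, let $V_{123,j}$
denote the subgroup of
$\mathrm{Horiz}_j(G_{(1)}) \times \mathrm{Horiz}_j(G_{(2)}) \times \mathrm{Horiz}_j(G_{(3)})$
generated by the diagonal elements

$$(\phi_{(1)}(e_{j,k}), \phi_{(2)}(e_{j,k}), \phi_{(3)}(e_{j,k}))$$

for $1 \leq k \leq D_{*,j}$, and by the elements

$$(\phi_{(1)}(e_{j,k}), \mathrm{id}, \mathrm{id}),\quad (\mathrm{id}, \phi_{(2)}(e_{j,k}), \mathrm{id}),\quad (\mathrm{id}, \mathrm{id}, \phi_{(3)}(e_{j,k}))$$

for $D_{*,j} < k \leq D_j$. We define the subgroup $V_{124,j}$ of
$\mathrm{Horiz}_j(G_{(1)}) \times \mathrm{Horiz}_j(G_{(2)}) \times \mathrm{Horiz}_j(G_{(4)})$
similarly by replacing (3) with (4) throughout.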

Lemma 11.3 (Components of $G_P$). Let $1 \leq j \leq s-1$. Then the projection of
$G_P \cap G_j$ to
$\mathrm{Horiz}_j(G_{(1)}) \times \mathrm{Horiz}_j(G_{(2)}) \times \mathrm{Horiz}_j(G_{(3)})$
contains $V_{123,j}$. Similarly, the projection to
$\mathrm{Horiz}_j(G_{(1)}) \times \mathrm{Horiz}_j(G_{(2)}) \times \mathrm{Horiz}_j(G_{(4)})$
contains $V_{124,j}$.

Proof. We shall just prove the first claim; the second claim is similar (but uses
$\{a, b, c\} = \{1, 2, 4\}$ instead of $\{a, b, c\} = \{1, 2, 3\}$).

Suppose the claim failed for some $j$. Using (11.6) and duality, we conclude that
there exists a $\xi_j \in \Xi_j$ which annihilates the kernel of the projection to
$\mathrm{Horiz}_j(G_{(1)}) \times \mathrm{Horiz}_j(G_{(2)}) \times \mathrm{Horiz}_j(G_{(3)})$,
and which is non-trivial on $V_{123,j}$. As $\xi_j$ annihilates the kernel of this
projection, we have a decomposition of the form

$$\xi_j(x_{(1)}, x_{(2)}, x_{(3)}, x_{(4)}, x_{(0)}) = \xi_{(1),j}(x_{(1)}) + \xi_{(2),j}(x_{(2)}) + \xi_{(3),j}(x_{(3)})$$

for $x_{(i)} \in \mathrm{Horiz}_j(G_{(i)})$ for $i = 1,2,3,4,0$, where
$\xi_{(i),j} : \mathrm{Horiz}_j(G_{(i)}) \to \mathbb{R}$ for $i = 1,2,3$ are characters.

By definition of $\Xi_j$, we conclude that

$$\xi_{(1),j}(\mathrm{Taylor}_j(\mathcal{O}_{(1)})) + \xi_{(2),j}(\mathrm{Taylor}_j(\mathcal{O}_{(2)})) + \xi_{(3),j}(\mathrm{Taylor}_j(\mathcal{O}_{(3)})) = O(N^{-j}).$$

However, from (11.3) we have

$$\xi_{(i),j}(\mathrm{Taylor}_j(\mathcal{O}_{(i)})) = \sum_{k=1}^{D_j} c_{(i),j,k}\, \xi_{h_i,j,k} \tag{11.7}$$

where the $c_{(i),j,k}$ are standard integers, defined by the formula

$$c_{(i),j,k} := \xi_{(i),j}(\phi_{(i)}(e_{j,k})). \tag{11.8}$$

From the independence of (11.1) with $\{a, b, c\} = \{1,2,3\}$, we conclude that the
$c_{(i),j,k}$ all vanish for $i = 1,2,3$ and $D_{*,j} < k \leq D_j$, and that the sum
$c_{(1),j,k} + c_{(2),j,k} + c_{(3),j,k}$ vanishes for $1 \leq k \leq D_{*,j}$. But this
forces $\xi_j$ to vanish on $V_{123,j}$, a contradiction. □

We now take commutators in the spirit of an argument of Furstenberg and Weiss
[14] (see also [39], [50] for similar arguments in completely different settings) to
conclude the following result, which roughly speaking asserts that all "petal-petal
interactions" are trivial.

Corollary 11.4 (Furstenberg-Weiss commutator argument). Let $w$ be an
$(r_*-1)$-fold iterated commutator of generators
$e_{j_1,k_1}, \ldots, e_{j_{r_*},k_{r_*}}$ with $1 \leq j_l \leq s-1$,
$1 \leq k_l \leq D_{j_l}$ for $l = 1, \ldots, r_*$ and $j_1 + \cdots + j_{r_*} = s-1$
(thus $w$ has "degree-rank $(s-1, r_*)$" in some sense). Suppose that at least two of
the generators, say $e_{j_1,k_1}, e_{j_2,k_2}$, are "petal" generators in the sense that
$k_1 > D_{*,j_1}$ and $k_2 > D_{*,j_2}$. Then
$(\phi_{(1)}(w), \mathrm{id}, \mathrm{id}, \mathrm{id}, \mathrm{id}) \in G_P$.

Proof. For $e_{j_1,k_1}$, we may invoke Lemma 11.3 and find an element $g_{j_1,k_1}$ of
$G_P \cap G_{j_1}$ for which the coordinates 1,2,3 are equal (modulo projection to

$$\mathrm{Horiz}_{j_1}(G_{(1)}) \times \mathrm{Horiz}_{j_1}(G_{(2)}) \times \mathrm{Horiz}_{j_1}(G_{(3)}))$$

to $(\phi_{(1)}(e_{j_1,k_1}), \mathrm{id}, \mathrm{id})$. Similarly, we may find an element
$g'_{j_2,k_2}$ of $G_P \cap G_{j_2}$ for which the coordinates 1,2,4 are equal (modulo
projection to

$$\mathrm{Horiz}_{j_2}(G_{(1)}) \times \mathrm{Horiz}_{j_2}(G_{(2)}) \times \mathrm{Horiz}_{j_2}(G_{(4)}))$$

to $(\phi_{(1)}(e_{j_2,k_2}), \mathrm{id}, \mathrm{id})$. Finally, for all of the other
$e_{j,k}$, we can find elements $g''_{j,k}$ of $G_P \cap G_j$ for which the first coordinate
is equal (modulo projection to $\mathrm{Horiz}_j(G_{(1)})$) to $\phi_{(1)}(e_{j,k})$. If one
then takes iterated commutators of the $g_{j_1,k_1}, g'_{j_2,k_2}, g''_{j,k}$ in the
order indicated by $w$, we see (using the filtration property, the homomorphism
property of $\phi_{(1)}$, and the fact that the $G_{(i)}/\Gamma_{(i)}$ have degree-rank
$\leq (s-1, r_*)$ for $i = 1, 2, 3, 4$ and degree-rank $\leq (s-1, r_*-1)$ for $i = 0$) that
we obtain the element $(\phi_{(1)}(w), \mathrm{id}, \mathrm{id}, \mathrm{id}, \mathrm{id})$.
Since the iterated commutator of elements in $G_P$ stays in $G_P$, the claim follows. □

From Lemma 11.2 and Corollary 11.4 we immediately obtain the first part (i)
of Theorem 11.1. We now turn to the second part of the theorem. For this, we
need two further variants of Lemma 11.3. For any $1 \leq j \leq s-1$, let $V_{\mathrm{ind},j}$
be the subspace of
$\mathrm{Horiz}_j(G_{(1)}) \times \mathrm{Horiz}_j(G_{(2)}) \times \mathrm{Horiz}_j(G_{(3)}) \times \mathrm{Horiz}_j(G_{(4)})$
generated by the elements

$$(\phi_{(1)}(e_{j,k}), \phi_{(2)}(e_{j,k}), \phi_{(3)}(e_{j,k}), \phi_{(4)}(e_{j,k}))$$

for $1 \leq k \leq D_{*,j}$ and the elements

$$(\phi_{(1)}(e_{j,k}), \mathrm{id}, \mathrm{id}, \mathrm{id}),\ (\mathrm{id}, \phi_{(2)}(e_{j,k}), \mathrm{id}, \mathrm{id}),\ (\mathrm{id}, \mathrm{id}, \phi_{(3)}(e_{j,k}), \mathrm{id}),\ (\mathrm{id}, \mathrm{id}, \mathrm{id}, \phi_{(4)}(e_{j,k}))$$

for $D_{*,j} < k \leq D_{*,j} + D'_{\mathrm{ind},j}$.

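Lemma 11.5 (Components of $G_P$, II). Let $1 \leq j \leq s-1$. Then the projection
of $G_P \cap G_j$ to
$\mathrm{Horiz}_j(G_{(1)}) \times \mathrm{Horiz}_j(G_{(2)}) \times \mathrm{Horiz}_j(G_{(3)}) \times \mathrm{Horiz}_j(G_{(4)})$
contains $V_{\mathrm{ind},j}$.

Proof. Suppose the claim failed for some $j$. Using (11.6) and duality, we conclude
that there exists a $\xi_j \in \Xi_j$ which annihilates the kernel of the projection to
$\mathrm{Horiz}_j(G_{(1)}) \times \mathrm{Horiz}_j(G_{(2)}) \times \mathrm{Horiz}_j(G_{(3)}) \times \mathrm{Horiz}_j(G_{(4)})$,
and which is non-trivial on $V_{\mathrm{ind},j}$. In particular, we have a decomposition of
the form

$$\xi_j(x_{(1)}, x_{(2)}, x_{(3)}, x_{(4)}, x_{(0)}) = \sum_{i=1}^{4} \xi_{(i),j}(x_{(i)})$$

for $x_{(i)} \in \mathrm{Horiz}_j(G_{(i)})$ for $i = 1,2,3,4,0$, where
$\xi_{(i),j} : \mathrm{Horiz}_j(G_{(i)}) \to \mathbb{R}$ for $i = 1, 2, 3, 4$ are characters.

By definition of $\Xi_j$, we conclude that

$$\sum_{i=1}^{4} \xi_{(i),j}(\mathrm{Taylor}_j(\mathcal{O}_{(i)})) = O(N^{-j}).$$

Inserting (11.7), we conclude that

$$\sum_{k=1}^{D_j} \sum_{i=1}^{4} c_{(i),j,k}\, \xi_{h_i,j,k} = O(N^{-j}). \tag{11.10}$$

The left-hand side is an integer linear combination of the degree $j$ frequencies in

$$\mathcal{F}_{*,\mathrm{ind}} \uplus \biguplus_{i=1}^{4} \mathcal{F}'_{h_i,\mathrm{ind}} \uplus \biguplus_{i=1}^{4} \mathcal{F}'_{h_i,\mathrm{lin}}.$$

Using the Freiman homomorphism property from Lemma 10.10, we can eliminate
the role of $\mathcal{F}'_{h_4,\mathrm{lin}}$, leaving only

$$\mathcal{F}_{*,\mathrm{ind}} \uplus \biguplus_{i=1}^{4} \mathcal{F}'_{h_i,\mathrm{ind}} \uplus \biguplus_{i=1}^{3} \mathcal{F}'_{h_i,\mathrm{lin}}.$$

But this is just (11.1) for $\{a,b,c\} = \{1,2,3\}$. We conclude that the coefficients
of the left-hand side of (11.10) in this basis vanish, which in terms of the original
coefficients $c_{(i),j,k}$ means that

$$\sum_{i=1}^{4} c_{(i),j,k} = 0$$

for $1 \leq k \leq D_{*,j}$, and

$$c_{(i),j,k} = 0$$

for $D_{*,j} < k \leq D_{*,j} + D'_{\mathrm{ind},j}$. But this forces $\xi_j$ to vanish on
$V_{\mathrm{ind},j}$, a contradiction. □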

We now apply the commutator argument to show that "independent" frequencies
also ultimately have a trivial effect.

Corollary 11.6 (Furstenberg-Weiss commutator argument, II). Let $w$ be an
$(r_*-1)$-fold iterated commutator of generators
$e_{j_1,k_1}, \ldots, e_{j_{r_*},k_{r_*}}$ with $1 \leq j_l \leq s-1$,
$1 \leq k_l \leq D_{j_l}$ for $l = 1, \ldots, r_*$ and $j_1 + \cdots + j_{r_*} = s-1$.
Suppose that at least one of the generators, say $e_{j_1,k_1}$, is an "independent"
generator in the sense that $D_{*,j_1} < k_1 \leq D_{*,j_1} + D'_{\mathrm{ind},j_1}$. Then
$(\phi_{(1)}(w), \mathrm{id}, \mathrm{id}, \mathrm{id}, \mathrm{id}) \in G_P$.

Proof. We may assume that $k_l \leq D_{*,j_l}$ for all $2 \leq l \leq r_*$, as the claim
would follow from Corollary 11.4 otherwise.

For $e_{j_1,k_1}$, we may invoke Lemma 11.5 and find an element $g_{j_1,k_1}$ of
$G_P \cap G_{j_1}$ for which the first 4 coordinates are equal (modulo projection to
$\mathrm{Horiz}_{j_1}(G_{(1)}) \times \mathrm{Horiz}_{j_1}(G_{(2)}) \times \mathrm{Horiz}_{j_1}(G_{(3)}) \times \mathrm{Horiz}_{j_1}(G_{(4)})$)
to $(\phi_{(1)}(e_{j_1,k_1}), \mathrm{id}, \mathrm{id}, \mathrm{id})$. For the other $e_{j,k}$, we
can find elements $g'_{j,k}$ of $G_P \cap G_j$ for which the first coordinate is equal
(modulo projection to $\mathrm{Horiz}_j(G_{(1)})$) to $\phi_{(1)}(e_{j,k})$. Taking
commutators of $g_{j_1,k_1}$ and $g'_{j,k}$ in the order indicated by $w$, we obtain the
claim. □

Combining Corollary 11.6 with Lemma 11.2 we obtain the second part of Theorem 11.1.

12. Building a nilobject

The aim of this section is to at last build an object coming from an $s$-step
nilmanifold. Recall from the discussion in §7 that this object will be a multidegree
$(1, s-1)$-nilcharacter $\chi'(h,n)$, and that this completes the proof of Theorem 7.2.
This in turn was used iteratively to prove Theorem 7.1, the heart of our whole
paper. It will then remain to supply the symmetry argument, which will take us
from a 2-dimensional nilsequence to a 1-dimensional one; this will be accomplished
in the next section.

Let $f, H, \chi, (\chi_h)_{h \in H}$ be as in Theorem 7.2. If we apply Lemma 10.10, we obtain
the following objects:

• A dense subset $H'$ of $H$;

• Dimension vectors $D_* = D_{*,\mathrm{ind}} + D_{*,\mathrm{rat}}$ and
$D' = D'_{\mathrm{lin}} + D'_{\mathrm{ind}} + D'_{\mathrm{sml}}$, which we write as
$D_* = (D_{*,i})_{i=1}^{s-1}$, $D_{*,\mathrm{ind}} = (D_{*,\mathrm{ind},i})_{i=1}^{s-1}$, etc.;

• A core horizontal frequency vector
$\mathcal{F}_* = (\xi_{*,i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D_{*,i}}$, which is
partitioned as $\mathcal{F}_* = \mathcal{F}_{*,\mathrm{ind}} \uplus \mathcal{F}_{*,\mathrm{rat}}$,
with the indicated dimension vectors $D_{*,\mathrm{ind}}, D_{*,\mathrm{rat}}$;

• A petal horizontal frequency vector
$\mathcal{F}'_h = (\xi'_{h,i,j})_{1 \leq i \leq s-1;\, 1 \leq j \leq D'_i}$, which is
partitioned as $\mathcal{F}'_h = \mathcal{F}'_{h,\mathrm{lin}} \uplus \mathcal{F}'_{h,\mathrm{ind}} \uplus \mathcal{F}'_{h,\mathrm{sml}}$,
which is a limit function of $h$ and with the indicated dimension vectors
$D'_{\mathrm{lin}}, D'_{\mathrm{ind}}, D'_{\mathrm{sml}}$;

• Nilmanifolds $G_h/\Gamma_h$ and $G_{0,h}/\Gamma_{0,h}$ of degree-rank $\leq (s-1, r_*)$ and
$\leq (s-1, r_*-1)$ respectively for each $h \in H'$, depending in a limit fashion
on $h$;

• Polynomial sequences
$g_h, g_{0,h} \in {}^*\mathrm{poly}(\mathbb{Z}_\mathbb{N} \to (G_h)_\mathbb{N})$ for each $h \in H'$,
depending in a limit fashion on $h$;

• Lipschitz functions $F_h \in \mathrm{Lip}({}^*(G_h/\Gamma_h \times G_{0,h}/\Gamma_{0,h}))$ for
each $h \in H'$, depending in a limit fashion on $h$;

• a filtered homomorphism $\phi_h : G^{D_*+D'} \to G_h$ for each $h \in H'$, depending in a
limit fashion on $h$; and

• a character $\eta_h : G^{D_*+D'}_{(s-1,r_*)} \to \mathbb{R}$ for each $h \in H'$, depending in
a limit fashion on $h$

that obey the following properties:

• For every $1 \leq i \leq d$ and $1 \leq j \leq D'_{i,\mathrm{lin}}$, there exist
$\alpha_{i,j} \in {}^*\mathbb{T}$ and $\beta_{i,j} \in {}^*\mathbb{R}$ such that (10.2) holds, and
furthermore the map $h \mapsto \xi'_{h,i,j}$ is a Freiman homomorphism on $H'$.

• For almost all additive quadruples $(h_1, h_2, h_3, h_4)$ in $H$,

$$\mathcal{F}_{*,\mathrm{ind}} \uplus \biguplus_{i=1}^{4} \mathcal{F}'_{h_i,\mathrm{ind}} \uplus \biguplus_{i=1}^{3} \mathcal{F}'_{h_i,\mathrm{lin}}$$

is independent.

• We have the representation

$$\chi_h(n) = F_h(g_h(n){}^*\Gamma_h, g_{0,h}(n){}^*\Gamma_{0,h})$$

for every $h \in H'$.

• $\phi_h : G^{D_*+D'} \to G_h$ is a filtered homomorphism such that

$$F_h(\phi_h(t)x, x_0) = e(\eta_h(t)) F_h(x, x_0) \tag{12.1}$$

for all $t \in G^{D_*+D'}_{(s-1,r_*)}$, $x \in G_h/\Gamma_h$, and $x_0 \in G_{0,h}/\Gamma_{0,h}$;

• One has the Taylor coefficients

$$\mathrm{Taylor}_i(g_h \Gamma_h) = \pi_{\mathrm{Horiz}_i(G_h/\Gamma_h)}\Big(\phi_h\Big(\prod_{j=1}^{D_{*,i}+D'_i} e_{i,j}^{\xi_{h,i,j}}\Big)\Big) \tag{12.2}$$

for all $1 \leq i \leq s-1$.

There are only countably many nilmanifolds $G/\Gamma$ up to isomorphism, so by
passing from $H'$ to a dense subset using Lemma A.9 we may assume that

$$G_h/\Gamma_h = G/\Gamma \quad \text{and} \quad G_{0,h}/\Gamma_{0,h} = G_0/\Gamma_0$$

are independent of $h$. Similarly we may take $\eta_h = \eta$ and $\phi_h = \phi$ to be
independent of $h$. From the Arzelà-Ascoli theorem, the space of possible $F_h$ is
totally bounded, and so (shrinking $\varepsilon$ slightly if necessary) we may also assume
that $F_h = F$ is independent of $h$.

For $j$ with $1 \leq j \leq D_{*,i}$, since $\xi_{h,i,j}$ is independent of $h$, we can ensure
that $\xi_{h,i,j} = \gamma_{i,j}$ is also independent of $h$. Meanwhile, for
$D_{*,i} < j \leq D_{*,i} + D'_{i,\mathrm{lin}}$, from (10.2) we may assume that $\xi_{h,i,j}$
takes the form

$$\xi_{h,i,j} = \{\alpha_{i,j} h\} \beta_{i,j} \mod 1$$

for some $\alpha_{i,j} \in {}^*\mathbb{T}$ and $\beta_{i,j} \in {}^*\mathbb{R}$. By passing to a dense
subset of $H'$ using the pigeonhole principle, we may assume for each $i, j$ that
$\{\alpha_{i,j} h\}$ is contained in a
subinterval around *0 of length at most 1/10 (say).
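
The pigeonhole step can be illustrated in a finite setting by the following Python sketch: the parameters $h = 1, \ldots, N$ are grouped according to which of ten equal arcs of $[0,1)$ contains the fractional part of $\alpha h$, and the fullest arc keeps at least a tenth of them. The function name `densest_arc`, the fractional-part convention in $[0,1)$, and the sample values of `alpha` and `N` are assumptions for this illustration; the sketch does not reproduce the limit setting or the placement of the subinterval used in the text.

```python
from collections import defaultdict
from fractions import Fraction

def densest_arc(alpha, N, parts=10):
    """Split [0,1) into `parts` equal arcs, group h = 1..N by the arc that
    contains the fractional part of alpha*h, and return (arc index, members)
    for the fullest arc."""
    buckets = defaultdict(list)
    for h in range(1, N + 1):
        frac = (alpha * h) % 1            # fractional part in [0, 1)
        buckets[int(frac * parts)].append(h)
    k = max(buckets, key=lambda idx: len(buckets[idx]))
    return k, buckets[k]

alpha = Fraction(17, 97)   # hypothetical sample frequency
N = 500
k, H_refined = densest_arc(alpha, N)

# By pigeonhole the fullest arc holds at least N/parts of the parameters.
print(k, len(H_refined))
```

Every retained $h$ then has its fractional part in a single arc of length 1/10, which is the finite analogue of the refinement described above.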

We now wish to apply Theorem 11.1 to obtain more convenient equivalent
representatives (in $\Xi^{(s-1,r_*)}_{\mathrm{DR}}([N])$) $\tilde\chi_h$ for the nilcharacters
$\chi_h$. Let $\tilde G$ be the free Lie group generated by the generators $\tilde e_{i,j}$ for
$1 \leq i \leq s-1$ and $1 \leq j \leq D_{*,i} + D'_{\mathrm{lin},i}$ subject to the following
relations:

• Any $(r-1)$-fold iterated commutator of
$\tilde e_{i_1,j_1}, \ldots, \tilde e_{i_r,j_r}$ with $i_1 + \ldots + i_r > s-1$ vanishes;

• Any $(r-1)$-fold iterated commutator of
$\tilde e_{i_1,j_1}, \ldots, \tilde e_{i_r,j_r}$ with $i_1 + \ldots + i_r = s-1$ and $r > r_*$
vanishes;

• Any $(r-1)$-fold iterated commutator of
$\tilde e_{i_1,j_1}, \ldots, \tilde e_{i_r,j_r}$ in which $j_l > D_{*,i_l}$ for at least two
values of $l$ vanishes.

We give this group a DR-filtration $\tilde G_{\mathrm{DR}}$ by defining $\tilde G_{(d,r)}$ to be
the group generated by the $(r'-1)$-fold iterated commutators of
$\tilde e_{i_1,j_1}, \ldots, \tilde e_{i_{r'},j_{r'}}$ with $i_1 + \ldots + i_{r'} \geq d$ and
$r' \geq r$. We then let $\tilde\Gamma$ be the discrete group generated by the
$\tilde e_{i,j}$; $\tilde G/\tilde\Gamma$ is then a nilmanifold of degree-rank
$\leq (s-1, r_*)$.

Let $G^*$ be the subgroup of $G^{D_*+D'}$ generated by $(r-1)$-fold iterated commutators
of $e_{i_1,j_1}, \ldots, e_{i_r,j_r}$ with $i_1 + \ldots + i_r = s-1$ in which
$j_l > D_{*,i_l}$ for at least two values of $l$, or $j_l > D_{*,i_l} + D'_{\mathrm{lin},i_l}$ for
at least one value of $l$. Then $G^*$ is a subgroup of the central group
$G^{D_*+D'}_{(s-1,r_*)}$ of $G^{D_*+D'}$, and $\tilde G$ is isomorphic to the quotient of
$G^{D_*+D'}$ by $G^*$. We let $\tilde\phi : G^{D_*+D'} \to \tilde G$ denote the quotient map.
From Theorem 11.1, the character $\eta : G^{D_*+D'}_{(s-1,r_*)} \to \mathbb{R}$ annihilates
$G^*$, and thus descends to a vertical character
$\tilde\eta : \tilde G_{(s-1,r_*)} \to \mathbb{R}$.

We select a function $\tilde F \in \mathrm{Lip}(\tilde G/\tilde\Gamma \to S^\omega)$ with vertical
frequency $\tilde\eta$; such a function can be built using the construction (6.3).

We then define the polynomial sequences
$\tilde g_0, \tilde g_h \in {}^*\mathrm{poly}(\mathbb{Z}_\mathbb{N} \to \tilde G_\mathbb{N})$ by the
formulae

i=l J=l 

^(n):=n II g^*"® (12.4) 

and consider the nilcharacter

$$\tilde\chi_h(n) := \tilde F(\tilde g_0(n) \tilde g_h(n) {}^*\tilde\Gamma). \tag{12.5}$$

These nilcharacters are equivalent to $\chi_h$ in $\mathrm{Symb}^{(s-1,r_*)}_{\mathrm{DR}}([N])$, as
the following lemma shows.

Lemma 12.1. For each $h \in H'$, $\chi_h$ and $\tilde\chi_h$ are equivalent (as nilcharacters
of degree-rank $(s-1, r_*)$) on $[N]$.

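Proof. Fix $h$. It suffices to show that $\chi_h \otimes \overline{\tilde\chi_h}$ is a
nilsequence of degree $< s-1$. We can write this sequence as

$$n \mapsto F'_h(g'_h(n) {}^*\Gamma'), \tag{12.6}$$

where $G' := G \times G_0 \times \tilde G$, $\Gamma' := \Gamma \times \Gamma_0 \times \tilde\Gamma$,
$g'_h \in {}^*\mathrm{poly}(\mathbb{Z}_\mathbb{N} \to G'_\mathbb{N})$ is the sequence

$$g'_h(n) := (g_h(n), g_{0,h}(n), \tilde g_0(n) \tilde g_h(n))$$

and $F'_h \in \mathrm{Lip}({}^*(G'/\Gamma'))$ is the function

$$F'_h(x, x_0, y) := F_h(x, x_0) \otimes \overline{\tilde F(y)}.$$

We define a DR-filtration $G'_{\mathrm{DR}}$ on $G'$ by defining $G'_{(d,r)}$ for
$(d, r) \in \mathrm{DR}$ with $r \geq 1$ to be the Lie group generated by the following sets:

(i) $G_{(d,r+1)} \times (G_0)_{(d,r)} \times \tilde G_{(d,r+1)}$;

(ii) $\{(\phi(g), \mathrm{id}, \tilde\phi(g)) : g \in G^{D_*+D'}_{(d,r)}\}$,

with the convention that $(d, d+1) = (d+1, 0)$. We also set
$G'_{(d,0)} := G'_{(d,1)}$ for $d \geq 1$. One easily verifies that this is a filtration.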

We claim that $g'_h$ is polynomial with respect to this filtration. Indeed, the
sequence $n \mapsto (\mathrm{id}, g_{0,h}(n), \mathrm{id})$ is already polynomial in this
filtration, so by Corollary B.4 it suffices to verify that the sequence

$$n \mapsto (g_h(n), \mathrm{id}, \tilde g_0(n) \tilde g_h(n)) \tag{12.7}$$

is polynomial. We use Lemma B.9 to Taylor expand
$g_h(n) = \prod_{i=0}^{s-1} g_{h,i}^{\binom{n}{i}}$, where $g_{h,i} \in G_{(i,0)}$. From (12.2),
one has

$$g_{h,i} = \phi\Big(\prod_{j=1}^{D_{*,i}+D'_i} e_{i,j}^{\xi_{h,i,j}}\Big) \mod G_{(i,2)}.$$

By construction of the filtration of $G'$, this implies that

$$\Big(g_{h,i}, \mathrm{id}, \prod_{j=1}^{D_{*,i}+D'_i} e_{i,j}^{\xi_{h,i,j}} \mod G^*\Big) \in G'_{(i,0)}.$$

Applying Corollary B.4, we conclude that the sequence

$$n \mapsto \Big(g_h(n), \mathrm{id}, \prod_{i=0}^{s-1} \Big(\prod_{j=1}^{D_{*,i}+D'_i} e_{i,j}^{\xi_{h,i,j}}\Big)^{\binom{n}{i}} \mod G^*\Big)$$

is polynomial with respect to the $G'$ filtration. Applying the Baker-Campbell-
Hausdorff formula repeatedly, and using (12.3), (12.4), we see that

$$n \mapsto \prod_{i=0}^{s-1} \Big(\prod_{j=1}^{D_{*,i}+D'_i} e_{i,j}^{\xi_{h,i,j}}\Big)^{\binom{n}{i}} \mod G^*$$

differs from the sequence $n \mapsto \tilde g_0(n) \tilde g_h(n)$ by a sequence which is
polynomial in the shifted filtration $(\tilde G_{(d,r+1)})_{(d,r) \in \mathrm{DR}}$. We conclude
that (12.7) is polynomial as required.

Next, we claim that $F'_h$ is invariant with respect to the action of the central
group

$$G'_{(s-1,r_*)} = \{(\phi(g), \mathrm{id}, \tilde\phi(g)) : g \in G^{D_*+D'}_{(s-1,r_*)}\}.$$

It suffices to check this for generators $(\phi(w), \mathrm{id}, w \mod G^*)$, where $w$ is an
$(r_*-1)$-fold commutator of $e_{i_1,j_1}, \ldots, e_{i_{r_*},j_{r_*}}$ in $G^{D_*+D'}$ with
$i_1 + \ldots + i_{r_*} = s-1$. There are two cases. If one has
$j_l > D_{*,i_l} + D'_{\mathrm{lin},i_l}$ for some $l$, then $w$ lies in $G^*$ and is also
annihilated by $\eta$, and the claim follows from (12.1). If instead one has
$j_l \leq D_{*,i_l} + D'_{\mathrm{lin},i_l}$ for all $l$, then the claim again follows from
(12.1) together with the construction of $\tilde\eta$ and $\tilde F$.

We may now quotient out $G'_{\mathrm{DR}}$ by $G'_{(s-1,r_*)}$ and obtain a representation of
(12.6) as a nilsequence of degree-rank $< (s-1, r_*)$, as desired. □

From this lemma and Lemma E.8(ii) we can express $\chi_h$ as a bounded linear
combination of $\tilde\chi_h \otimes \psi_h$ for some nilsequence $\psi_h$ of degree-rank
$\leq (s-1, r_*-1)$. Thus, to prove Theorem 7.2, it suffices to show that there is a
nilcharacter $\tilde\chi \in \Xi^{(1,s-1)}({}^*\mathbb{Z}^2)$ such that
$\tilde\chi_h(n) = \tilde\chi(h,n)$ for many $h \in H'$ and all $n \in [N]$.

We illustrate the construction with an example. Let

$$G := \{e_1^{t_1} e_2^{t_2} [e_1, e_2]^{t_{12}} : t_1, t_2, t_{12} \in \mathbb{R}\}$$

be the universal degree 2 nilpotent group (6.1) generated by $e_1, e_2$. Let $F$ be the
Lipschitz function in equation (6.3). Suppose

$$\chi_h(n) := F(g_h(n) {}^*\Gamma)$$

with $g_h(n) := e_2^{\beta n} e_1^{\alpha_h n}$, where $\alpha_h := \{\delta h\}\gamma$, and
$\beta, \gamma, \delta \in {}^*\mathbb{R}$. As computed previously, we have

$$F_k(g_h(n) {}^*\Gamma) = \phi_k(\alpha_h n \mod 1, \beta n \mod 1)\, e(\{\alpha_h n\} \beta n)$$

for some Lipschitz function $\phi_k : \mathbb{T}^2 \to \mathbb{C}$. We would like to interpret
the function $(h,n) \mapsto \chi_h(n)$ as a nilcharacter in
$\Xi^{(1,2)}_{\mathrm{Multi}}({}^*\mathbb{Z}^2)$. The first task is to identify a subgroup
$G_{\mathrm{petal}}$ of the group $G$ representing that part of $G$ that is "influenced by"
the petal frequency $\alpha_h$; more specifically, we take $G_{\mathrm{petal}}$ to be the
subgroup of $G$ generated by $e_1$ and $[e_1, e_2]$, that is to say

$$G_{\mathrm{petal}} = \langle e_1, [e_1, e_2] \rangle_{\mathbb{R}} = \{e_1^{t_1} [e_1, e_2]^{t_{12}} : t_1, t_{12} \in \mathbb{R}\}.$$

Note that $G_{\mathrm{petal}}$ is abelian and normal in $G$. In particular $G$ acts on
$G_{\mathrm{petal}}$ by conjugation, and we may form the semidirect product

$$G \ltimes G_{\mathrm{petal}} := \{(g, g_1) : g \in G, g_1 \in G_{\mathrm{petal}}\},$$

defining multiplication by

$$(g, g_1) \cdot (g', g'_1) = (g g', g_1^{g'} g'_1),$$

where $a^b := b^{-1} a b$ denotes conjugation.
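
The group law above, and the action $(g, g_1) \mapsto (g g_1^t, g_1)$ used in the next step, can be sanity-checked on integer points. The Python sketch below models the Heisenberg group by upper unitriangular integer matrices and verifies on a small sample that $G_{\mathrm{petal}}$ is normal, that the semidirect multiplication is associative, and that the action is by automorphisms for integer $t$. The matrix model, the helper names (`mul`, `inv`, `conj`, `sd_mul`, `rho`) and the sample elements are assumptions chosen for the illustration; real exponents and the lattice are not modelled.

```python
from itertools import product

# Model (an assumption for this sketch): the Heisenberg group as upper
# unitriangular integer matrices [[1,a,c],[0,1,b],[0,0,1]], stored as (a, b, c).
# Here e1 = (1,0,0), e2 = (0,1,0), and the commutator [e1,e2] is central.

def mul(x, y):
    (a, b, c), (a2, b2, c2) = x, y
    return (a + a2, b + b2, c + c2 + a * b2)

def inv(x):
    a, b, c = x
    return (-a, -b, a * b - c)

def conj(x, y):
    """x^y := y^{-1} x y, the convention used in the text."""
    return mul(mul(inv(y), x), y)

def in_petal(x):
    """G_petal = <e1, [e1,e2]> consists of the elements (a, 0, c)."""
    return x[1] == 0

def sd_mul(p, q):
    """(g, g1) . (g', g1') = (g g', g1^{g'} g1') on G x G_petal."""
    (g, g1), (h, h1) = p, q
    return (mul(g, h), mul(conj(g1, h), h1))

def rho(t, p):
    """rho(t)(g, g1) = (g g1^t, g1) for an integer t."""
    g, g1 = p
    a, _, c = g1
    return (mul(g, (t * a, 0, t * c)), g1)   # (a,0,c)^t = (ta,0,tc)

G_SAMPLE = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 2, -1), (-2, 1, 3)]
P_SAMPLE = [(0, 0, 0), (1, 0, 0), (0, 0, 1), (2, 0, -3)]
PAIRS = list(product(G_SAMPLE, P_SAMPLE))

petal_is_normal = all(in_petal(conj(g1, g)) for g, g1 in PAIRS)
associative = all(
    sd_mul(sd_mul(p, q), r) == sd_mul(p, sd_mul(q, r))
    for p, q, r in product(PAIRS, repeat=3)
)
rho_is_automorphism = all(
    rho(t, sd_mul(p, q)) == sd_mul(rho(t, p), rho(t, q))
    for t in (-2, 1, 3)
    for p, q in product(PAIRS, repeat=2)
)
print(petal_is_normal, associative, rho_is_automorphism)
```

The automorphism check relies on $G_{\mathrm{petal}}$ being abelian and normal, which is exactly the property noted in the text.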

Now consider the action $\rho$ of $\mathbb{R}$ on $G \ltimes G_{\mathrm{petal}}$ defined by

$$\rho(t)(g, g_1) := (g g_1^t, g_1).$$

We may form a further semidirect product

$$G' := \mathbb{R} \ltimes_\rho (G \ltimes G_{\mathrm{petal}}),$$

in which the product operation is defined by

$$(t, (g, g_1)) \cdot (t', (g', g'_1)) = (t + t', \rho(t')(g, g_1) \cdot (g', g'_1)).$$

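$G'$ is a Lie group; indeed, one easily verifies that it is 3-step nilpotent. We give $G'$
an $\mathbb{N}^2$-filtration:

$$\begin{aligned}
G'_{(0,0)} &= G' \\
G'_{(1,0)} &= \{(t,(g,\mathrm{id})) : t \in \mathbb{R},\ g \in G_{\mathrm{petal}}\} \\
G'_{(1,1)} &= \{(0,(g,\mathrm{id})) : g \in G_{\mathrm{petal}}\}, \\
G'_{(1,2)} &= \{(0,(g,\mathrm{id})) : g \in [G,G]\}, \\
G'_{(0,1)} &= \{(0,(g,g_1)) : g \in G_{\mathrm{petal}};\ g_1 \in G_{\mathrm{petal}}\} \\
G'_{(0,2)} &= \{(0,(g,g_1)) : g, g_1 \in [G,G]\},
\end{aligned}$$

with $G'_{(i,j)} := \{\mathrm{id}\}$ for all other $(i,j) \in \mathbb{N}^2$. One easily verifies that this is a filtration.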
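Inside $G'$ we take the lattice

$$\Gamma' := \mathbb{Z} \ltimes_\rho (\Gamma \ltimes \Gamma_{\mathrm{petal}}),$$

where $\Gamma_{\mathrm{petal}} := \Gamma \cap G_{\mathrm{petal}}$. Now consider the polynomial $g' : \mathbb{Z}^2 \to G'$ defined by
$$g'(h,n) := (0, (e_2^{\beta n}, e_1^{\gamma n})) \cdot (\delta h, (\mathrm{id}, \mathrm{id}))$$

and observe that

$$\begin{aligned}
g'(h,n)\Gamma' &= (0, (e_2^{\beta n}, e_1^{\gamma n})) \cdot (\{\delta h\}, (\mathrm{id}, \mathrm{id}))\Gamma' \\
&= (\{\delta h\}, (e_2^{\beta n} e_1^{\{\delta h\}\gamma n}, e_1^{\gamma n}))\Gamma'.
\end{aligned}$$

For $h$ in a dense subset $H''$, $\{\delta h\}$ lies in a small interval $I$; let $\psi$ be a smooth cutoff
function supported on $2I$. Take $F' : G'/\Gamma' \to \mathbb{C}^D$ to be the function defined by

$$F'((t,(g,g_1))\Gamma') := \psi(t) F(g\Gamma)$$

whenever $t \in I$ and $0$ otherwise. Then we have for $h \in H''$

$$F'(g'(h,n)\Gamma') = F(e_2^{\beta n} e_1^{\{\delta h\}\gamma n}\Gamma) = \tilde\chi_h(n),$$

giving the desired representation of $(h,n) \mapsto \tilde\chi_h(n)$ as an (almost) degree $(1,2)$
nilcharacter.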
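The product law on $G'$ in this example can be checked mechanically. The following is a minimal sketch, assuming (as an illustrative choice not made in the text) that $G$ is modelled by $3 \times 3$ upper unitriangular matrices stored as triples `(x, y, z)`, with $e_1 = (1,0,0)$, $e_2 = (0,1,0)$ and $G_{\mathrm{petal}}$ the triples with $y = 0$. All function names are hypothetical. It tests associativity of the product on $\mathbb{R} \ltimes_\rho (G \ltimes G_{\mathrm{petal}})$ on random rational elements and evaluates the orbit point $(0,(e_2^{\beta n}, e_1^{\gamma n})) \cdot (\theta, (\mathrm{id},\mathrm{id}))$.

```python
from fractions import Fraction as Fr
import random

ID = (Fr(0), Fr(0), Fr(0))

def mul(a, b):
    # Product of unitriangular matrices [[1,x,z],[0,1,y],[0,0,1]] stored as (x, y, z).
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2] + a[0] * b[1])

def inv(a):
    return (-a[0], -a[1], -a[2] + a[0] * a[1])

def conj(a, b):
    # a^b := b^{-1} a b, the conjugation convention used in the text.
    return mul(mul(inv(b), a), b)

def petal_pow(g1, t):
    # t-th power inside the abelian subgroup {(x, 0, z)} modelling G_petal.
    assert g1[1] == 0
    return (t * g1[0], Fr(0), t * g1[2])

def pair_mul(p, q):
    # (g, g1)(g', g1') = (g g', g1^{g'} g1') on G ⋉ G_petal.
    (g, g1), (h, h1) = p, q
    return (mul(g, h), mul(conj(g1, h), h1))

def rho(t, p):
    # rho(t)(g, g1) = (g g1^t, g1).
    g, g1 = p
    return (mul(g, petal_pow(g1, t)), g1)

def big_mul(a, b):
    # (t, x)(t', x') = (t + t', rho(t')(x) x') on G'.
    (t, p), (u, q) = a, b
    return (t + u, pair_mul(rho(u, p), q))

def orbit_point(beta, gamma, theta, n):
    # (0, (e2^{beta n}, e1^{gamma n})) . (theta, (id, id))
    e2_pow = (Fr(0), beta * n, Fr(0))
    e1_pow = (gamma * n, Fr(0), Fr(0))
    return big_mul((Fr(0), (e2_pow, e1_pow)), (theta, (ID, ID)))

def check_associativity(trials=200, seed=0):
    rng = random.Random(seed)
    r = lambda: Fr(rng.randint(-5, 5), rng.randint(1, 4))
    elem = lambda: (r(), ((r(), r(), r()), (r(), Fr(0), r())))
    return all(
        big_mul(big_mul(a, b), c) == big_mul(a, big_mul(b, c))
        for a, b, c in ((elem(), elem(), elem()) for _ in range(trials))
    )
```

In this model `orbit_point(beta, gamma, theta, n)` has first coordinate `theta`, group coordinate `mul(e2^{beta n}, e1^{theta gamma n})` and petal coordinate `e1^{gamma n}`, matching the displayed computation of $g'(h,n)\Gamma'$ with $\theta = \{\delta h\}$.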

We now turn to the general case, proceeding by an abstract
algebraic construction. Let $G_{\mathrm{petal}}$ be the subgroup of $G$ generated by $(r-1)$-fold
($r \ge 1$) iterated commutators of $e_{i_1,j_1}, \ldots, e_{i_r,j_r}$ in which $j_l > D_{*,i_l}$ for exactly one
value of $l$. Then $G_{\mathrm{petal}}$ is a rational abelian normal subgroup of $G$. To see that
$G_{\mathrm{petal}}$ is normal, one uses the equalities

$$e_{i,j}^{-1}[g,h]e_{i,j} = [e_{i,j}^{-1} g e_{i,j}, e_{i,j}^{-1} h e_{i,j}] \quad \text{and} \quad e_{i,j}^{-1} g e_{i,j} = g[g, e_{i,j}],$$

the commutator identities in equation (3.1), and the fact that any iterated commutator
of $e_{i_1,j_1}, \ldots, e_{i_r,j_r}$ in which $j_l > D_{*,i_l}$ for more than one value of $l$ is trivial
in $G$.

In particular, $G$ acts on $G_{\mathrm{petal}}$ by conjugation, leading to the semidirect product
$G \ltimes G_{\mathrm{petal}}$ of pairs $(g,g_1)$ with the product

$$(g,g_1)(g',g_1') := (gg', g_1^{g'} g_1').$$
Next, let $R$ be the commutative ring of tuples $t = (t_{i,j})_{1 \le i \le s-1;\ D_{*,i} < j \le D_{*,i}+D'_{\mathrm{lin},i}}$
with $t_{i,j} \in \mathbb{R}$, which we endow with the pointwise product. For each $t \in R$, we can
define a homomorphism $g \mapsto g^t$ on $G$, which we define on generators by mapping
$e_{i,j}$ to $e_{i,j}^{t_{i,j}}$ for $D_{*,i} < j \le D_{*,i} + D'_{\mathrm{lin},i}$, but preserving $e_{i,j}$ for $j \le D_{*,i}$. Such
a homomorphism is well-defined as it preserves the defining relations of $G$. We
observe the composition law

$$(g^t)^{t'} = g^{tt'}$$

for $g \in G$ and $t, t' \in R$. Also, on the abelian subgroup $G_{\mathrm{petal}}$ of $G$, we see that

$$g^t g^{t'} = g^{t+t'} \tag{12.8}$$

as can be seen from the Baker-Campbell-Hausdorff formula (3.2). We can thus
express

$$\tilde g_h(n) = g_0(n)\, g_1(n)^{\{\alpha h\}} \tag{12.9}$$
where $g_1 \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G_{\mathrm{petal}})_{\mathbb{N}})$ is the polynomial sequence

i=l j=D« t t+l 

and $\{\alpha h\} \in R$ is the element

$$\{\alpha h\} := (\{\alpha_{i,j} h\})_{1 \le i \le s-1;\ D_{*,i} < j \le D_{*,i}+D'_{\mathrm{lin},i}}.$$

The homomorphism $g \mapsto g^t$ preserves $G_{\mathrm{petal}}$, and is the identity once $G_{\mathrm{petal}}$ is
quotiented out. As a consequence we see that

$$(g g_1 g^{-1})^t = g g_1^t g^{-1} \tag{12.10}$$

for any $g \in G$ and $g_1 \in G_{\mathrm{petal}}$.

We can now define an action $\rho$ of $R$ (viewed now as an additive group) on
$G \ltimes G_{\mathrm{petal}}$ by defining

$$\rho(t)(g,g_1) := (g g_1^t, g_1);$$

the properties (12.8), (12.10) ensure that this is indeed an action. We can then
define the semi-direct product $G' := R \ltimes_\rho (G \ltimes G_{\mathrm{petal}})$ to be the set of pairs
$(t, (g,g_1))$ with the product

$$(t,(g,g_1))(t',(g',g_1')) = (t+t', \rho(t')(g,g_1)(g',g_1')).$$

This is a Lie group. We can give it an $\mathbb{N}^2$-filtration $(G'_{(d_1,d_2)})_{(d_1,d_2) \in \mathbb{N}^2}$ as follows:

(i) If $d_1 > 1$, then $G'_{(d_1,d_2)} := \{\mathrm{id}\}$.

(ii) If $d_1 = 1$ and $d_2 > 0$, then $G'_{(1,d_2)}$ consists of the elements $(0, (g, \mathrm{id}))$ with
$g \in G_{d_2} \cap G_{\mathrm{petal}}$.

(iii) If $d_1 = 1$ and $d_2 = 0$, then $G'_{(1,0)}$ consists of the elements $(t, (g, \mathrm{id}))$ with
$t \in R$ and $g \in G_{\mathrm{petal}}$.

(iv) If $d_1 = 0$ and $d_2 > 0$, then $G'_{(0,d_2)}$ consists of the elements $(0, (g,g_1))$ with
$g \in G_{d_2}$ and $g_1 \in G_{\mathrm{petal}} \cap G_{d_2}$.

(v) $G'_{(0,0)} = G'$.

One easily verifies that this is a filtration of degree $\le (1, s-1)$ with $G'_{(0,0)} = G'$.

We let $\Gamma'$ be the subgroup of $G'$ consisting of pairs $(t,(g,g_1))$ with $g \in \Gamma$,
$g_1 \in \Gamma_{\mathrm{petal}}$, and with all coefficients of $t$ integers. One easily verifies that $\Gamma'$ is
a cocompact subgroup of $G'$, and that the above $\mathbb{N}^2$-filtration of $G'$ is rational with
respect to $\Gamma'$, so that $G'/\Gamma'$ has the structure of a filtered nilmanifold.

We consider the orbit $O' \in {}^*\mathrm{poly}(\mathbb{Z}^2_{\mathbb{N}^2} \to (G'/\Gamma')_{\mathbb{N}^2})$ defined by

$$O'(h,n) := (0,(g_0(n),g_1(n)))(\alpha h,(\mathrm{id},\mathrm{id}))\,{}^*\Gamma',$$

where

$$\alpha h := (\alpha_{i,j} h)_{1 \le i \le s-1;\ D_{*,i} < j \le D_{*,i}+D'_{\mathrm{lin},i}}.$$

As $g_0, g_1$ were already known to be polynomial maps, and the linear map $h \mapsto \alpha h$
is clearly polynomial also, we see from Corollary B.4 and the choice of filtration on
$G'$ that $O'$ is a polynomial orbit.

Now we simplify the orbit. Working on the abelian group $R$, we see that

$$(\alpha h, (\mathrm{id}, \mathrm{id}))\,{}^*\Gamma' = (\{\alpha h\}, (\mathrm{id}, \mathrm{id}))\,{}^*\Gamma',$$

and then commuting this with $(0, (g_0(n), g_1(n)))$, we obtain

$$O'(h,n) = (\{\alpha h\}, (g_0(n) g_1(n)^{\{\alpha h\}}, g_1(n)))\,{}^*\Gamma'. \tag{12.11}$$

Recall that for many $h \in H$, each component $\{\alpha_{i,j} h\}$ of $\{\alpha h\}$ lies in an interval
$I_{i,j}$ of length at most $1/10$. Let $2I_{i,j}$ be the interval of twice the length and with
the same centre as $I_{i,j}$, and let $\psi_{i,j} : \mathbb{R} \to \mathbb{R}$ be a smooth cutoff function supported
on $2I_{i,j}$. We then define a function $F' : G'/\Gamma' \to \mathbb{C}^D$ by setting

$$F'\bigl(((t_{i,j})_{1 \le i \le s-1;\ D_{*,i} < j \le D_{*,i}+D'_{\mathrm{lin},i}}, (g,g_1))\,{}^*\Gamma'\bigr) := \Bigl(\prod_{i=1}^{s-1} \prod_{j=D_{*,i}+1}^{D_{*,i}+D'_{\mathrm{lin},i}} \psi_{i,j}(t_{i,j})\Bigr) F(g\,{}^*\Gamma)$$

whenever $(g,g_1) \in G \ltimes G_{\mathrm{petal}}$ and $t_{i,j} \in 2I_{i,j}$ for all $1 \le i \le s-1$ and $D_{*,i} < j \le
D_{*,i} + D'_{\mathrm{lin},i}$, with $F'$ set equal to zero whenever no representation of the above
form exists. One can easily verify that $F'$ is well-defined and Lipschitz. Since $F$
has vertical frequency $\eta$, $F'$ has vertical frequency $\eta' : G'_{(1,s-1)} \to \mathbb{R}$, defined by
the formula

$$\eta'((0,(g,\mathrm{id}))) := \eta(g)$$

for all $g \in G_{s-1}$. From (12.5), (12.3) and (12.11), we see that for many $h \in H'$ we
have

$$\tilde\chi_h(n) = F' \circ O'(h,n)$$
for all $n \in [N]$. By construction, $F' \circ O' \in \Xi^{(1,s-1)}_{\mathrm{Mult}}({}^*\mathbb{Z}^2)$, and Theorem 7.2 follows.




13. The symmetry argument 

In this, the last section of the main part of the paper, we supply the symmetry
argument, Theorem 7.4. We recall that statement now.

Theorem 7.4. Let $f \in L^\infty[N]$, let $H$ be a dense subset of $[[N]]$, and let $\chi \in
\Xi^{(1,s-1)}_{\mathrm{Mult}}({}^*\mathbb{Z}^2)$ be such that $\Delta_h f$ $\le (s-2)$-correlates with $\chi(h, \cdot)$ for all $h \in H$. Then
there exists a nilcharacter $\Theta \in \Xi^s({}^*\mathbb{Z})$ (with the degree filtration) and a nilsequence
$\Psi \in \mathrm{Nil}^{\subset J}({}^*\mathbb{Z}^2)$ (with the multidegree filtration), with $J$ given by the downset

$$J := \{(i,j) \in \mathbb{N}^2 : i+j \le s-1\} \cup \{(i,s-i) : 2 \le i \le s\}, \tag{13.1}$$

such that $\chi(h,n)$ is a bounded linear combination of $\Theta(n+h) \otimes \overline{\Theta(n)} \otimes \Psi(h,n)$.

Example 13.1. Suppose that $s = 2$, $\chi(h,n) = e(P(h,n))$, and $P(h,n) : {}^*\mathbb{Z}^2 \to {}^*\mathbb{R}$
is a symmetric bilinear form in $n, h$. Then observe that

$$\chi(h,n) = \Theta(n+h)\overline{\Theta(n)}\Psi(h,n) \tag{13.2}$$

where $\Theta(n) := e(\tfrac{1}{2}P(n,n))$ and $\Psi(h,n) := e(-\tfrac{1}{2}P(h,h))$, which illustrates a special
case of Theorem 7.4. More generally, if $s \ge 2$ and $\chi(h,n) = e(P(h,n,\ldots,n))$ with
$P(h,n_1,\ldots,n_{s-1}) : {}^*\mathbb{Z}^s \to {}^*\mathbb{R}$ a symmetric multilinear form, then we have (13.2)
with $\Theta(n) := e(\tfrac{1}{s}P(n,\ldots,n))$, and $\Psi(h,n)$ a polynomial phase involving terms of
multidegree $(i, s-i)$ in $h, n$ with $2 \le i \le s$. Thus we again obtain a special case
of Theorem 7.4. Note how the symmetry of $P$ is crucial in order to make these
examples work, which explains why we refer to Theorem 7.4 as a symmetrisation
result. Morally speaking, this type of symmetry property ultimately stems from the
identity $\Delta_h \Delta_k f = \Delta_k \Delta_h f$. We remark that an analogous symmetrisation result
was crucial to the analogous proof of GI(2) in [21] (see also [51]), although our
arguments here are slightly different.
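The case $s = 2$ of Example 13.1 reduces to an identity between phases, which can be checked exactly. The sketch below assumes the illustrative form $P(h,n) = c\,hn$ with an arbitrarily chosen rational coefficient $c$ (a hypothetical choice, not taken from the text) and verifies that $P(h,n) = \tfrac12 P(n+h,n+h) - \tfrac12 P(n,n) - \tfrac12 P(h,h)$, which is (13.2) at the level of phases.

```python
from fractions import Fraction as Fr

C = Fr(3, 7)  # arbitrary illustrative coefficient

def P(h, n):
    # A symmetric bilinear form on scalars: P(h, n) = C * h * n.
    return C * h * n

def theta_phase(n):
    # Phase of Theta(n) = e(P(n, n) / 2).
    return P(n, n) / 2

def psi_phase(h, n):
    # Phase of Psi(h, n) = e(-P(h, h) / 2); it does not depend on n when s = 2.
    return -P(h, h) / 2

def identity_holds(h, n):
    # Phase form of (13.2): chi(h, n) = Theta(n + h) * conj(Theta(n)) * Psi(h, n).
    return P(h, n) == theta_phase(n + h) - theta_phase(n) + psi_phase(h, n)
```

For example, `all(identity_holds(h, n) for h in range(-5, 6) for n in range(-5, 6))` evaluates the identity on a small grid using exact rational arithmetic.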

From the inclusions at the end of §6, $\chi(h,n)$ is a nilcharacter on $\mathbb{Z}^2$ (with the
degree filtration) of degree $\le s$. For similar reasons, any nilsequence $\Psi(h,n)$ of
degree $\le s-1$ (using the degree filtration on $\mathbb{Z}^2$) will automatically be of the form
required for Theorem 7.4. In view of this and Lemma E.8, we see that it will suffice
to obtain a factorisation of the form

$$[\chi]_{\Xi^s([[N]] \times [N])} = [\Theta(n+h)]_{\Xi^s([[N]] \times [N])} - [\Theta(n)]_{\Xi^s([[N]] \times [N])} + [\Psi(h,n)]_{\Xi^s([[N]] \times [N])}$$

where $\Theta \in \Xi^s({}^*\mathbb{Z})$ is a one-dimensional nilcharacter of degree $\le s$ (which automatically
makes $(h,n) \mapsto \Theta(n)$ and $(h,n) \mapsto \Theta(n+h)$ two-dimensional nilcharacters of
degree $\le s$, by Lemma E.8(vi)), and $\Psi \in \Xi^s({}^*\mathbb{Z}^2)$ is a two-dimensional nilcharacter
of multidegree

$$\subset \{(i,j) \in \mathbb{N}^2 : i+j \le s;\ j \le s-2\}. \tag{13.3}$$

The set of classes $[\Psi(h,n)]_{\Xi^s([[N]] \times [N])}$, with $\Psi$ of the above form, is a subgroup of
the space $\mathrm{Symb}^s([[N]] \times [N])$ of all symbols of degree $s$ nilcharacters on $[[N]] \times [N]$.
Denoting the equivalence relation induced by these classes as $\equiv$, our task is thus to
show that

$$[\chi]_{\Xi^s([[N]] \times [N])} \equiv [\Theta(n+h)]_{\Xi^s([[N]] \times [N])} - [\Theta(n)]_{\Xi^s([[N]] \times [N])}.$$

In view of Theorem E.10 and Lemma E.8(vii), there is a nilcharacter $\tilde\chi$ on ${}^*\mathbb{Z}^s$
of degree $(1, \ldots, 1)$ which is symmetric in the last $s-1$ variables, and such that

$$[\chi(h,n)]_{\Xi^s({}^*\mathbb{Z}^2)} = s[\tilde\chi(h,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^2)}. \tag{13.4}$$
Inspired by the polynomial identity

$$s h n^{s-1} = (n+h)^s - n^s - \ldots$$

where the terms in $\ldots$ are of degree $s$ in $h, n$ but of degree at most $s-2$ in $n$, we
now choose

$$\Theta(n) := \tilde\chi(n, \ldots, n).$$
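The degree claim in the polynomial identity above follows from the binomial theorem, and can be confirmed by listing coefficients. The following sketch (the function name is hypothetical) returns the monomials of $(n+h)^s - n^s - s h n^{s-1}$, keyed by their degrees in $h$ and $n$.

```python
from math import comb

def remainder_coeffs(s):
    """Nonzero coefficients of (n+h)^s - n^s - s*h*n^(s-1), keyed by (deg in h, deg in n)."""
    coeffs = {}
    for i in range(s + 1):       # i is the power of h, s - i the power of n
        c = comb(s, i)
        if i == 0:
            c -= 1               # subtract n^s
        if i == 1:
            c -= s               # subtract s*h*n^(s-1)
        if c != 0:
            coeffs[(i, s - i)] = c
    return coeffs
```

Every surviving key `(i, j)` has `i + j == s` and `j <= s - 2`, which is the multidegree restriction appearing in (13.3).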

From Lemma E.8(vi) we see that $\Theta$ is a nilcharacter of degree $\le s$. Our task is
now to show that

$$[\tilde\chi(n+h,\ldots,n+h)]_{\Xi^s([[N]] \times [N])} - [\tilde\chi(n,\ldots,n)]_{\Xi^s([[N]] \times [N])} - s[\tilde\chi(h,n,\ldots,n)]_{\Xi^s([[N]] \times [N])} \equiv 0. \tag{13.5}$$

To manipulate this, we use the following lemma. 

Lemma 13.2 (Multilinearity). Let $\tilde\chi$ be a nilcharacter on $\mathbb{Z}^s$ (with the multidegree
filtration) of degree $(1, \ldots, 1)$. Let $m \ge 1$ be standard, and let $L_1, \ldots, L_s : \mathbb{Z}^m \to \mathbb{Z}$
and $L_1' : \mathbb{Z}^m \to \mathbb{Z}$ be homomorphisms. Then we have linearity in the first variable,
in the sense that

$$[\tilde\chi(L_1(n) + L_1'(n), L_2(n), \ldots, L_s(n))]_{\Xi^s({}^*\mathbb{Z}^m)} = [\tilde\chi(L_1(n), L_2(n), \ldots, L_s(n))]_{\Xi^s({}^*\mathbb{Z}^m)} + [\tilde\chi(L_1'(n), L_2(n), \ldots, L_s(n))]_{\Xi^s({}^*\mathbb{Z}^m)},$$

where $n = (n_1, \ldots, n_m)$ are the $m$ independent variables of ${}^*\mathbb{Z}^m$, and $\mathbb{Z}^m$ is given
the degree filtration. We similarly have linearity in the other $s-1$ variables.

Proof. We prove the claim for the first variable, as the other cases follow from
symmetry. From Lemma E.3 and Lemma E.8(vi), it will suffice to show that the
expression

$$\tilde\chi(h_1 + h_1', h_2, \ldots, h_s) \otimes \overline{\tilde\chi(h_1, h_2, \ldots, h_s)} \otimes \overline{\tilde\chi(h_1', h_2, \ldots, h_s)} \tag{13.6}$$

is a degree $< s$ nilsequence in $h_1, h_1', h_2, \ldots, h_s$ (using the degree filtration).

Write $\tilde\chi(h_1, \ldots, h_s) = F(g(h_1, \ldots, h_s)\,{}^*\Gamma)$, where $G/\Gamma$ is an $\mathbb{N}^s$-filtered nilmanifold
of degree $\le (1,\ldots,1)$, $F \in \mathrm{Lip}({}^*(G/\Gamma))$ has a vertical frequency, and $g \in
{}^*\mathrm{poly}(\mathbb{Z}^s_{\mathbb{N}^s} \to G_{\mathbb{N}^s})$. Then the expression (13.6) takes the form

$$\tilde F(\tilde g(h_1, h_1', h_2, \ldots, h_s)\,{}^*\Gamma^3)$$

where $\tilde g : {}^*\mathbb{Z}^{s+1} \to G^3$ is the map
$$\tilde g(h_1, h_1', h_2, \ldots, h_s) := (g(h_1 + h_1', h_2, \ldots, h_s),\ g(h_1, h_2, \ldots, h_s),\ g(h_1', h_2, \ldots, h_s))$$
and $\tilde F \in \mathrm{Lip}({}^*(G/\Gamma)^3)$ is the map

$$\tilde F(x_1, x_2, x_3) = F(x_1) \otimes \overline{F(x_2)} \otimes \overline{F(x_3)}.$$
By Lemma B.9, we can expand

$$g(h_1, \ldots, h_s) = \prod_{i_1, \ldots, i_s \in \{0,1\}} g_{i_1,\ldots,i_s}^{h_1^{i_1} \cdots h_s^{i_s}}$$

for some $g_{i_1,\ldots,i_s} \in G_{(i_1,\ldots,i_s)}$, where we order $\{0,1\}^s$ lexicographically (say).

We now $\mathbb{N}$-filter $G^3$ by defining $(G^3)_i$ to be the group generated by $(G_{(i_1,\ldots,i_s)})^3$
for all $i_1, \ldots, i_s \in \mathbb{N}$ with $i_1 + \ldots + i_s > i$, together with the groups $\{(g_1 g_2, g_1, g_2) :
g_1, g_2 \in G_{(i_1,\ldots,i_s)}\}$ for $i_1 + \ldots + i_s = i$. From the Baker-Campbell-Hausdorff formula
(3.2) one verifies that this is a rational filtration of $G^3$. From the Taylor expansion
we also see that $\tilde g$ is polynomial with respect to this filtration (giving $\mathbb{Z}^{s+1}$ the
degree filtration). Finally, as $F$ has a vertical character, we see that $\tilde F$ is invariant
with respect to the action of $(G^3)_s = \{(g_1 g_2, g_1, g_2) : g_1, g_2 \in G_{(1,\ldots,1)}\}$. Restricting
$G^3$ to $(G^3)_0$ and quotienting out by $(G^3)_s$ we obtain the claim. □

Using this lemma repeatedly, together with the symmetry of $\tilde\chi$ in the final $s-1$
variables, we see that we can expand

$$[\tilde\chi(n+h, \ldots, n+h)]_{\Xi^s({}^*\mathbb{Z}^2)} = \sum_{j=0}^{s-1} \binom{s-1}{j} \bigl([\tilde\chi(n, h, \ldots, h, n, \ldots, n)]_{\Xi^s({}^*\mathbb{Z}^2)} + [\tilde\chi(h, h, \ldots, h, n, \ldots, n)]_{\Xi^s({}^*\mathbb{Z}^2)}\bigr),$$

where in the terms on the right-hand side, the final $j$ coefficients are equal to $n$,
the first coefficient is either $n$ or $h$, and the remaining coefficients are $h$. Note that
a term with $j$ $h$ factors and $(s-j)$ $n$ factors will have degree (13.3) and thus be
negligible as long as $j \ge 2$. Neglecting these terms, we obtain the simpler expression

$$[\tilde\chi(n+h, \ldots, n+h)]_{\Xi^s({}^*\mathbb{Z}^2)} \equiv [\tilde\chi(n, \ldots, n)]_{\Xi^s({}^*\mathbb{Z}^2)} + [\tilde\chi(h, n, \ldots, n)]_{\Xi^s({}^*\mathbb{Z}^2)} + (s-1)[\tilde\chi(n, h, n, \ldots, n)]_{\Xi^s({}^*\mathbb{Z}^2)}.$$

Comparing this with (13.5), we will be done as soon as we can show the symmetry
property

$$(s-1)[\tilde\chi(h, n, \ldots, n)]_{\Xi^s([[N]] \times [N])} \equiv (s-1)[\tilde\chi(n, h, n, \ldots, n)]_{\Xi^s([[N]] \times [N])}. \tag{13.7}$$

This property does not automatically follow from the construction of $\tilde\chi$. Instead,
we must use the correlation properties of $\chi$, as follows.
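The bookkeeping behind this expansion can be illustrated in a polynomial model, where the class $[\tilde\chi(\cdot,\ldots,\cdot)]$ is replaced by the value of a multilinear form. The sketch below is only an analogy under that assumption: it takes $s = 3$, builds a random integer trilinear form $T(x,y,z)$ on vectors that is symmetric in $(y,z)$ but not in $x$ (all names hypothetical), and extracts the part of $T(n+h,n+h,n+h)$ that is linear in $h$. That part equals $T(h,n,n) + (s-1)\,T(n,h,n)$, mirroring the last two terms of the simplified expansion.

```python
import itertools
import random

def make_form(dim, rng):
    """Random integer trilinear form T(x, y, z), symmetric in (y, z) only."""
    T = {}
    for i, j, k in itertools.product(range(dim), repeat=3):
        if j <= k:
            T[i, j, k] = T[i, k, j] = rng.randint(-3, 3)
    return T

def ev(T, x, y, z):
    # Evaluate the trilinear form on three vectors.
    return sum(c * x[i] * y[j] * z[k] for (i, j, k), c in T.items())

def linear_part_in_h(T, n, h):
    """Part of T(n+h, n+h, n+h) that is exactly linear in h."""
    plus = [a + b for a, b in zip(n, h)]
    minus = [a - b for a, b in zip(n, h)]
    # The odd part in h consists of the linear and cubic terms; remove the cubic one.
    odd = (ev(T, plus, plus, plus) - ev(T, minus, minus, minus)) // 2
    return odd - ev(T, h, h, h)
```

In this model $T(h,n,n)$ and $T(n,h,n)$ are in general different numbers, which is the analogue of the remark that (13.7) does not follow from the construction alone.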

By hypothesis and Lemma E.5 we have that for all $h$ in a dense subset $H$ of
$[[N]]$, we can find a degree $\le s-2$ nilcharacter $\varphi_h$ such that $f_1(\cdot + h)\overline{f_2(\cdot)}$ correlates
with $\chi(h, \cdot) \otimes \varphi_h$. By Corollary A.12 we may assume that the map $h \mapsto \varphi_h$
is a limit map. We set $\varphi_h := 0$ for $h \notin H$.

To use this information, we return to Proposition 8.4. Invoking that proposition,
we see that for many additive quadruples $(h_1, h_2, h_3, h_4)$ in $[[N]]$, the sequence

$$n \mapsto \chi(h_1, n) \otimes \chi(h_2, n + h_1 - h_4) \otimes \overline{\chi(h_3, n)} \otimes \overline{\chi(h_4, n + h_1 - h_4)} \otimes \varphi_{h_1}(n) \otimes \varphi_{h_2}(n + h_1 - h_4) \otimes \overline{\varphi_{h_3}(n)} \otimes \overline{\varphi_{h_4}(n + h_1 - h_4)}$$

is biased.

We make the change of variables $(h_1, h_2, h_3, h_4) = (h+a, h+b, h+a+b, h)$ and
then pigeonhole in $h$, to conclude the existence of an $h_0$ for which

$$n \mapsto \tau(a,b,n) \otimes \varphi_{h_0+a}(n) \otimes \varphi_{h_0+b}(n+a) \otimes \overline{\varphi_{h_0+a+b}(n)} \otimes \overline{\varphi_{h_0}(n+a)}$$

is biased for many pairs $a, b \in [[2N]]$, where $\tau = \tau_{h_0}$ is the expression

$$\tau(a,b,n) := \chi(h_0+a, n) \otimes \chi(h_0+b, n+a) \otimes \overline{\chi(h_0+a+b, n)} \otimes \overline{\chi(h_0, n+a)}. \tag{13.8}$$

Henceforth $h_0$ is fixed, and we will suppress the dependence of various functions on
this parameter. From Lemma E.3, $\tau$ is a degree $\le s$ nilcharacter on ${}^*\mathbb{Z}^3$ (with the
degree filtration). We record its top order symbol:



^Here is a key place where we use the hypothesis $s \ge 3$ (the other is Lemma 10.9). For $s = 2$
the lower order terms in Proposition 8.4 are useless; however a variant of the argument below still
works, see [21].
Lemma 13.3. We have

$$[\tau(a,b,n)]_{\Xi^s({}^*\mathbb{Z}^3)} \equiv s(s-1)[\tilde\chi(b,a,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)}$$
where by $\equiv$ we are quotienting by all symbols of degree $\le s-3$ in $n$.

Proof. From (I13.4j) . (|13.8I) . Lemma [E~3l and Lemma [ETHI one has 

$$[\tau(a,b,n)]_{\Xi^s({}^*\mathbb{Z}^3)} = s\bigl([\tilde\chi(a,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)} + [\tilde\chi(b,n+a,\ldots,n+a)]_{\Xi^s({}^*\mathbb{Z}^3)} - [\tilde\chi(a+b,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)}\bigr).$$
Applying Lemma 13.2 in the first variable we simplify this as

$$s\bigl([\tilde\chi(b,n+a,\ldots,n+a)]_{\Xi^s({}^*\mathbb{Z}^3)} - [\tilde\chi(b,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)}\bigr).$$

Applying Lemma 13.2 in all the other variables and gathering terms using the
symmetry of $\tilde\chi$ in those variables, we arrive at

$$\sum_{j=0}^{s-2} \binom{s-1}{j}\, s\,[\tilde\chi(b,a,\ldots,a,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)}$$

where there are $j$ occurrences of $n$ and $s-1-j$ occurrences of $a$. All the terms
with $j < s-2$ are of degree $\le s-3$ in $n$, and the claim follows. □



From Lemma E.8 we know that $\varphi_{h_0+b}(n+a)$ is a bounded linear combination of
$\varphi_{h_0+b}(n) \otimes \psi_{a,b}(n)$ for some degree $\le s-3$ nilsequence $\psi_{a,b}$. Similarly for $\varphi_{h_0}(n+a)$.
We conclude that

$$n \mapsto \tau(a,b,n) \otimes \varphi_{h_0+a}(n) \otimes \varphi_{h_0+b}(n) \otimes \overline{\varphi_{h_0+a+b}(n)} \otimes \overline{\varphi_{h_0}(n)}$$

is $\le (s-3)$-biased for many $a, b \in [[2N]]$.

We will now eliminate the $\varphi_h$ terms in order to focus attention on $\tau$. Applying
Corollary A.12, we may thus find a scalar degree $\le s-3$ nilsequence $\psi_{a,b}$ depending
in a limit fashion on $a, b \in [[2N]]$, such that

$$\bigl|\mathbb{E}_{a,b \in [[2N]];\, n \in [N]}\, \tau(a,b,n) \otimes \varphi_{h_0+a}(n) \otimes \varphi_{h_0+b}(n) \otimes \overline{\varphi_{h_0+a+b}(n)} \otimes \overline{\varphi_{h_0}(n)}\, \psi_{a,b}(n)\bigr| \gg 1.$$

We pull out the $b$-independent factors $\varphi_{h_0+a}(n) \otimes \overline{\varphi_{h_0}(n)}$ and Cauchy-Schwarz in
$a, n$ to conclude that

$$\bigl|\mathbb{E}_{a,b,b' \in [[2N]];\, n \in [N]}\, \tau(a,b,n) \otimes \overline{\tau(a,b',n)} \otimes \varphi_{h_0+b}(n) \otimes \overline{\varphi_{h_0+b'}(n)} \otimes \overline{\varphi_{h_0+a+b}(n)} \otimes \varphi_{h_0+a+b'}(n)\, \psi_{a,b,b'}(n)\bigr| \gg 1,$$

where $(a,b,b') \mapsto \psi_{a,b,b'}$ is a limit map assigning a scalar degree $\le s-3$ nilsequence
to each $a, b, b'$. Next, we make the substitution $c := a + b + b'$ and conclude that

$$\bigl|\mathbb{E}_{c,b,b' \in [[3N]];\, n \in [N]}\, \tau(c-b-b',b,n) \otimes \overline{\tau(c-b-b',b',n)} \otimes \varphi_{h_0+b}(n) \otimes \overline{\varphi_{h_0+b'}(n)} \otimes \overline{\varphi_{h_0+c-b'}(n)} \otimes \varphi_{h_0+c-b}(n)\, \psi'_{c,b,b'}(n)\bigr| \gg 1$$

where $(c,b,b') \mapsto \psi'_{c,b,b'}$ is a limit map assigning a scalar degree $\le s-3$ nilsequence
to each $c, b, b'$. By the pigeonhole principle, we can thus find a $c_0$ such that

$$\bigl|\mathbb{E}_{b,b' \in [[3N]];\, n \in [N]}\, \alpha(b,b',n) \otimes \varphi'_b(n) \otimes \overline{\varphi'_{b'}(n)}\, \psi'_{c_0,b,b'}(n)\bigr| \gg 1 \tag{13.9}$$

where $\alpha = \alpha_{c_0}$ is the form

$$\alpha(b,b',n) := \tau(c_0-b-b',b,n) \otimes \overline{\tau(c_0-b-b',b',n)} \tag{13.10}$$
and $\varphi'_b = \varphi'_{b,c_0}$ is the quantity

$$\varphi'_b(n) := \varphi_{h_0+b}(n) \otimes \varphi_{h_0+c_0-b}(n).$$

We fix this $c_0$. Again by Lemma E.3, $\alpha$ is a degree $\le s$ nilcharacter on ${}^*\mathbb{Z}^3$, and
we pause to record its symbol in the following lemma.

Lemma 13.4. We have

$$[\alpha(b,b',n)]_{\Xi^s({}^*\mathbb{Z}^3)} \equiv s(s-1)[\tilde\chi(b+b',b-b',n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)}$$

where by $\equiv$ we are quotienting by all symbols of degree $\le s-3$ in $n$.

Proof. From (13.10) and Lemma E.8 we can write the left-hand side as

$$[\tau(-b-b',b,n)]_{\Xi^s({}^*\mathbb{Z}^3)} - [\tau(-b-b',b',n)]_{\Xi^s({}^*\mathbb{Z}^3)}.$$
Applying Lemma 13.3, we can write this as

$$s(s-1)\bigl([\tilde\chi(-b-b',b,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)} - [\tilde\chi(-b-b',b',n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)}\bigr).$$
The claim then follows from some applications of Lemma 13.2. □

We return now to (13.9), and Cauchy-Schwarz in $b', n$ to eliminate the $\varphi'_{b'}(n)$
factor, yielding

$$\bigl|\mathbb{E}_{b_1,b_2,b' \in [[3N]];\, n \in [N]}\, \alpha(b_1,b',n) \otimes \overline{\alpha(b_2,b',n)} \otimes \varphi'_{b_1}(n) \otimes \overline{\varphi'_{b_2}(n)}\, \psi'_{b_1,b_2,b'}(n)\bigr| \gg 1$$

where $(b_1,b_2,b') \mapsto \psi'_{b_1,b_2,b'}$ is a limit map assigning a scalar degree $\le s-3$ nilsequence
to each $b_1, b_2, b'$. Finally, we Cauchy-Schwarz in $b_1, b_2, n$ to eliminate the
$\varphi'_{b_1}(n) \otimes \overline{\varphi'_{b_2}(n)}$ factor, yielding

$$\bigl|\mathbb{E}_{b_1,b_2,b_1',b_2' \in [[3N]];\, n \in [N]}\, \alpha(b_1,b_1',n) \otimes \overline{\alpha(b_2,b_1',n)} \otimes \overline{\alpha(b_1,b_2',n)} \otimes \alpha(b_2,b_2',n)\, \psi''_{b_1,b_2,b_1',b_2'}(n)\bigr| \gg 1.$$

Note how the $\varphi$ terms have now been completely eliminated. To eliminate the $\psi''$
terms, we first use the pigeonhole principle to find $b_0, b_0'$ such that

$$\bigl|\mathbb{E}_{b,b' \in [[3N]];\, n \in [N]}\, \alpha'(b,b',n)\, \psi''_{b,b_0,b',b_0'}(n)\bigr| \gg 1 \tag{13.11}$$

where $\alpha' = \alpha'_{b_0,b_0'}$ is the expression

$$\alpha'(b,b',n) := \alpha(b,b',n) \otimes \overline{\alpha(b_0,b',n)} \otimes \overline{\alpha(b,b_0',n)} \otimes \alpha(b_0,b_0',n). \tag{13.12}$$

We fix this $b_0, b_0'$. Again, $\alpha'$ is a degree $\le s$ nilcharacter on ${}^*\mathbb{Z}^3$. From Lemma 13.4
and Lemma 13.2 (and using Lemma E.8 to eliminate shifts by $b_0$) we conclude

$$[\alpha'(b,b',n)]_{\Xi^s({}^*\mathbb{Z}^3)} \equiv s(s-1)\bigl([\tilde\chi(b,b',n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)} - [\tilde\chi(b',b,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)}\bigr). \tag{13.13}$$

Note the similarity here with (13.7).

From (13.11), we conclude that the sequence $n \mapsto \alpha'(b,b',n)$ is $\le (s-3)$-biased
for many $b, b' \in [[3N]]$. Applying Proposition 5.6, we conclude that

$$\|\alpha'(b,b',n)\|_{U^{s-2}[N]} \gg 1$$

for many $b, b' \in [[3N]]$. We conclude (using Corollary A.6 to obtain the needed
uniformity) that

$$\mathbb{E}_{b,b' \in [[3N]]} \|\alpha'(b,b',\cdot)\|^{2^{s-2}}_{U^{s-2}[N]} \gg 1.$$
By definition of the Gowers norm, this implies that

$$\bigl|\mathbb{E}_{b,b',h_1,\ldots,h_{s-2} \in [[3N]];\, n \in [N]}\, \sigma(b,b',h_1,\ldots,h_{s-2},n)\, 1_\Omega(h_1,\ldots,h_{s-2},n)\bigr| \gg 1, \tag{13.14}$$
where $\Omega$ is the polytope

$$\Omega := \Bigl\{(h_1,\ldots,h_{s-2},n) : n + \sum_{j=1}^{s-2} \omega_j h_j \in [N] \text{ for all } \omega \in \{0,1\}^{s-2}\Bigr\}$$
and $\sigma$ is the expression

$$\sigma(b,b',h_1,\ldots,h_{s-2},n) := \bigotimes_{\omega \in \{0,1\}^{s-2}} \mathcal{C}^{|\omega|} \alpha'\Bigl(b,b',n + \sum_{j=1}^{s-2} \omega_j h_j\Bigr), \tag{13.15}$$

with $\mathcal{C}$ being the conjugation map.

From Lemma E.3, $\sigma$ is a nilcharacter of degree $s$ on ${}^*\mathbb{Z}^{s+1}$. In the following
lemma we compute its symbol.

Lemma 13.5. We have

$$[\sigma(b,b',h_1,\ldots,h_{s-2},n)]_{\Xi^s({}^*\mathbb{Z}^{s+1})} = s!\bigl([\tilde\chi(b,b',h_1,\ldots,h_{s-2})]_{\Xi^s({}^*\mathbb{Z}^{s+1})} - [\tilde\chi(b',b,h_1,\ldots,h_{s-2})]_{\Xi^s({}^*\mathbb{Z}^{s+1})}\bigr). \tag{13.16}$$

Proof. From (13.15) and Lemma E.8 we can write the left-hand side as

$$\sum_{\omega \in \{0,1\}^{s-2}} (-1)^{|\omega|} \Bigl[\alpha'\Bigl(b,b',n + \sum_{j=1}^{s-2} \omega_j h_j\Bigr)\Bigr]_{\Xi^s({}^*\mathbb{Z}^{s+1})}; \tag{13.17}$$

one should think of this as an $(s-2)$-fold "derivative" of $[\alpha'(b,b',n)]_{\Xi^s({}^*\mathbb{Z}^3)}$ in the $n$
variable.

From (13.13) we can write

$$[\alpha'(b,b',n)]_{\Xi^s({}^*\mathbb{Z}^3)} = s(s-1)\bigl([\tilde\chi(b,b',n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)} - [\tilde\chi(b',b,n,\ldots,n)]_{\Xi^s({}^*\mathbb{Z}^3)}\bigr) + [\beta(b,b',n)]_{\Xi^s({}^*\mathbb{Z}^3)}$$

where $\beta$ is of degree at most $s-3$ in $n$. In fact, by inspection of the derivation
of $\beta$, and heavy use of Lemma 13.2, one can express $[\beta(b,b',n)]_{\Xi^s({}^*\mathbb{Z}^3)}$ as a linear
combination of classes of the form

$$[\tilde\chi(n_1,\ldots,n_s)]_{\Xi^s({}^*\mathbb{Z}^3)}$$

where each of $n_1, \ldots, n_s$ is equal to either $b$, $b'$, or $n$, with at most $s-3$ copies of $n$
occurring. If one then substitutes this expansion into (13.17) and applies Lemma
13.2 repeatedly, one obtains the claim. □

On the other hand, from (13.14) and Lemma E.11 we see that on $[[3N]]^{s+1}$, $\sigma$
is equal to a nilsequence of degree $\le s-1$, and thus by Lemma E.8

$$[\sigma(b,b',h_1,\ldots,h_{s-2},n)]_{\Xi^s([[3N]]^{s+1})} = 0$$
and thus by (13.16)

$$s!\bigl([\tilde\chi(b,b',h_1,\ldots,h_{s-2})]_{\Xi^s([[3N]]^{s+1})} - [\tilde\chi(b',b,h_1,\ldots,h_{s-2})]_{\Xi^s([[3N]]^{s+1})}\bigr) = 0.$$
Applying Lemma E.3, we conclude that

$$s!\bigl([\tilde\chi(h,n,\ldots,n)]_{\Xi^s([[N]] \times [N])} - [\tilde\chi(n,h,n,\ldots,n)]_{\Xi^s([[N]] \times [N])}\bigr) = 0.$$

The claim (13.7) now follows from Lemma E.13. The proof of Theorem 7.4 is now
complete.
Appendix A. Basic theory of ultralimits
In this appendix we review the machinery of ultralimits.

We will assume the existence of a standard universe $\mathfrak{U}$ which contains all the
objects and spaces of interest for Theorem 1.3, such as real numbers, subsets of
real numbers, functions from $[N]$ to $\mathbb{C}$ for finite $N \in \mathbb{N}$, nilmanifolds (or more
precisely, a representative from each equivalence class of nilmanifolds), and so forth.
The precise construction of this universe is not important, so long as it forms
a set. We refer to objects and spaces inside the standard universe as standard
objects and standard spaces, with the latter being sets whose elements are in the
former category. Thus for instance, elements of $\mathbb{N}$ are standard natural numbers,
the Heisenberg nilmanifold
$\begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix} / \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}$ is a standard nilmanifold (consisting
entirely of standard points), and so forth.

The one technical ingredient we need is the following: 

Lemma A.1 (Ultrafilter lemma). There exists a collection $p$ of subsets of the
natural numbers $\mathbb{N}$ with the following properties:

(i) (Monotonicity) If $A \in p$ and $B \supset A$, then $B \in p$.

(ii) (Closure under intersection) If $A, B \in p$, then $A \cap B \in p$.

(iii) (Maximality) If $A \subset \mathbb{N}$, then either $A \in p$ or $\mathbb{N} \setminus A \in p$, but not both.

(iv) (Non-principality) If $A \in p$, and $A'$ is formed from $A$ by adding or deleting
finitely many elements to or from $A$, then $A' \in p$.

Proof. The collection of subsets of $\mathbb{N}$ which are cofinite (i.e. whose complement
is finite) already obeys the monotonicity, closure under intersection, and non-principality
properties. Using Zorn's lemma^, one can enlarge this collection to
a maximal collection, which then obeys all the required properties. □

Throughout the paper, we fix a non-principal ultrafilter $p$. A property $P(n)$
depending on a natural number $n$ is said to hold for $n$ sufficiently close to $p$ if the
set of $n$ for which $P(n)$ holds lies in $p$.

Once we have fixed this ultrafilter we can define limit objects and spaces as
follows.

Definition A.2 (Limit objects). Given a sequence $(x_n)_{n \in \mathbb{N}}$ of standard objects in
$\mathfrak{U}$, we define their ultralimit $\lim_{n \to p} x_n$ to be the equivalence class of all sequences
$(y_n)_{n \in \mathbb{N}}$ of standard objects in $\mathfrak{U}$ such that $x_n = y_n$ for $n$ sufficiently close to $p$.
Note that the ultralimit $\lim_{n \to p} x_n$ can also be defined even if $x_n$ is only defined
for $n$ sufficiently close to $p$.

An ultralimit of standard natural numbers is known as a limit natural number,
an ultralimit of standard real numbers is known as a limit real number, etc.

For any standard object $x$, we identify $x$ with its own ultralimit $\lim_{n \to p} x$. Thus,
every standard natural number is a limit natural number, etc.



^By using this lemma, our results thus rely on the axiom of choice, which we will of course
assume throughout this paper. On the other hand, it is tedious but straightforward to rephrase
the inverse conjecture (Conjecture 1.2) in the language of Peano arithmetic (e.g. using Mal'cev
bases [48] to represent a nilmanifold, and approximating a Lipschitz function by a piecewise linear
one). Applying a famous theorem of Gödel [15], we then conclude that Conjecture 1.2 is provable
in ZFC if and only if it is provable in ZF. In fact, it is possible (with some effort) to directly
translate these ultrafilter arguments to a (lengthier) argument in which neither ultrafilters nor the axiom
of choice is used. We will not do so here, though, as the translation is quite tedious.
Any operation or relation on standard objects can be extended to limit objects
in the obvious manner. For instance, the sum of two limit real numbers $\lim_{n \to p} x_n$,
$\lim_{n \to p} y_n$ is the limit real number

$$\lim_{n \to p} x_n + \lim_{n \to p} y_n = \lim_{n \to p} (x_n + y_n),$$

and the statement $\lim_{n \to p} x_n < \lim_{n \to p} y_n$ means that $x_n < y_n$ for all $n$ sufficiently
close to $p$.

A famous theorem of Łoś asserts that any statement in first-order logic which is
true for standard objects is automatically true for limit objects as well. For instance, 
the standard real numbers form an ordered field, and so the limit real numbers do 
also, because the axioms of an ordered field can be phrased in first-order logic. We 
will use this theorem in the sequel without further comment. 

Definition A.3 (Limit spaces and functions). Let $(X_n)_{n \in \mathbb{N}}$ be a sequence of standard
spaces $X_n$ in $\mathfrak{U}$ indexed by the natural numbers. The ultraproduct $\prod_{n \to p} X_n$ of
the $X_n$ is defined to be the space of all ultralimits $\lim_{n \to p} x_n$, where $x_n \in X_n$ for
all $n$. Note $X_n$ only needs to be well-defined for $n$ sufficiently close to $p$ in order
for the ultraproduct to be well-defined. If $X$ is a set, the set $\prod_{n \to p} X$ is known as
the ultrapower of $X$ and is denoted ${}^*X$. Thus for instance ${}^*\mathbb{N}$ is the space of all
limit natural numbers, ${}^*\mathbb{R}$ is the space of all limit reals, etc.

We define a limit set to be an ultraproduct of sets, a limit group to be an
ultraproduct of groups, a limit finite set to be an ultraproduct of finite sets, and
so forth. A limit subset of a limit set $X = \prod_{n \to p} X_n$ is a limit set of the form
$Y = \prod_{n \to p} Y_n$, where $Y_n$ is a standard subset of $X_n$ for all $n$ sufficiently close to $p$.

Given a sequence of standard functions $f_n : X_n \to Y_n$ between standard sets
$X_n, Y_n$, we can form the ultralimit $f = \lim_{n \to p} f_n$ to be the function $f : \prod_{n \to p} X_n \to
\prod_{n \to p} Y_n$ defined by the formula

$$f\bigl(\lim_{n \to p} x_n\bigr) := \lim_{n \to p} f_n(x_n).$$

We refer to $f$ as a limit function or limit map, and say that $f(x)$ depends in a limit
fashion on $x$.

Remark. In the nonstandard analysis literature, limit natural numbers are known 
as nonstandard natural numbers, limit sets are known as internal sets, and limit 
functions are known as internal functions. We have chosen the limit terminology 
instead as we believe that it is less confusing and emphasises the role of ultralimits 
in the subject. 

It is important to note that not every subset of a limit set is again a limit set, for 
instance N is not a limit subset of *N (this fact is known as the overspill principle). 
Indeed, one can think of the limit subsets of a limit set as being analogous to the 
measurable subsets of a measure space. In a similar vein, not every function between 
two limit sets is a limit function; in this regard, limit functions are analogous to 
measurable functions. 

Example. (Pigeonhole principle) If $X$ is finite, then ${}^*X = X$. This is ultimately
because if the natural numbers are partitioned into finitely many classes, then exactly
one of those classes lies in $p$. In particular, we see that every standard finite set is
a limit finite set. However, the converse is not true. For instance, if $N$ is the limit
natural number $N := \lim_{n \to p} n$, then the limit set

$$[N] := \{n \in {}^*\mathbb{N} : 1 \le n \le N\} = \prod_{n \to p} [n]$$

is a limit finite set, but not a finite set.

Example. One has the identifications ${}^*\mathbb{T} = ({}^*\mathbb{R})/({}^*\mathbb{Z})$ and ${}^*(\mathbb{R}^k) = ({}^*\mathbb{R})^k$ for any
standard $k$, so one can talk about the limit unit circle ${}^*\mathbb{T}$ or the limit vector space
${}^*\mathbb{R}^k$ without ambiguity. We will refer to elements of ${}^*\mathbb{T}$ as frequencies.

Example. Every standard function $f : X \to Y$ can be identified with its ultralimit
$f : {}^*X \to {}^*Y$, thus for instance the fundamental character $e$ is a limit function
from ${}^*\mathbb{R}$ (or ${}^*\mathbb{T}$) to ${}^*\mathbb{C}$, and the fractional part function $\{\}$ is a limit function from
${}^*\mathbb{R}$ to ${}^*I_0$.

Remark. A limit finite set $A = \lim_{n \to p} A_n$ has a limit cardinality $|A|$, defined
by the formula

$$|A| := \lim_{n \to p} |A_n|.$$

Of course, $|A|$ is a limit natural number, and not a natural number in general. Thus
for instance, if $N$ is a limit natural number, then the limit finite set $[N]$ has a limit
cardinality of $N$ (despite being uncountable in the standard sense).

Asymptotic notation. By taking ultralimits, one can formalise asymptotic
notation, such as the $O()$ notation, in a manner that requires no additional quantifiers.

Definition A.4 (Asymptotic notation). A limit complex number $X$ is said to be
bounded if one has $|X| \le C$ for some standard real number $C$, in which case we
also write $X = O(1)$ or $|X| \ll 1$. More generally, given a limit complex number $X$
and limit non-negative number $Y$, we write $|X| \ll Y$, $Y \gg |X|$, or $X = O(Y)$ if
one has $|X| \le CY$ for some standard real number $C$. We write $X = o(Y)$ if one
has $|X| \le \varepsilon Y$ for every standard $\varepsilon > 0$. Observe that for any $X, Y$ with $Y$ positive,
one has either $|X| \gg Y$ or $X = o(Y)$. We say that $X$ is infinitesimal if $X = o(1)$,
and unbounded if $1/X = o(1)$. Thus for instance any limit complex number $X$ will
either be bounded or unbounded.

In a similar spirit, if $x \in {}^*V$ is a limit element of a standard topological space
$V$, we say that $x$ is bounded if $x$ is a limit element of a standard compact subset $K$
of $V$ (i.e. $x \in {}^*K$), and unbounded otherwise. The set of all bounded elements of
${}^*V$ will be denoted $\overline{V}$.

Example. The limit real $\lim_{n \to p} 1/n$ defines an infinitesimal, but non-zero, limit
real number $x$; its reciprocal $\lim_{n \to p} n$ is an unbounded limit real.

Example. Any bounded element of a discrete standard space is standard, by our
example on the pigeonhole principle. In particular, bounded integers are automatically
standard: $\overline{\mathbb{Z}} = \mathbb{Z}$. On the other hand, bounded elements in a continuous space
need not be standard, as the example $\lim_{n \to p} 1/n$ shows.

From the Bolzano-Weierstrass theorem, every bounded limit real number can
be expressed uniquely as the sum of a standard real number and an infinitesimal,
which may help explain the notation $\overline{\mathbb{R}}$. Note that $\overline{\mathbb{R}}$ contains the limit fundamental
domain ${}^*I_0$. Similarly, $\overline{\mathbb{C}}$ contains the limit unit circle ${}^*S^1 = \overline{S^1} = \{z \in {}^*\mathbb{C} : |z| = 1\}$,
where $S^1 := \{z \in \mathbb{C} : |z| = 1\}$.

Example. For any standard $D \in \mathbb{N}^+$, we endow $\mathbb{C}^D$ with the Euclidean norm

$$|(z_1, \ldots, z_D)| := (|z_1|^2 + \ldots + |z_D|^2)^{1/2}.$$
Then we have $\overline{\mathbb{C}^D} = \overline{\mathbb{C}}^D$: an element $(z_1, \ldots, z_D) \in {}^*\mathbb{C}^D$ is bounded if and only if
each component is bounded.

One modest advantage of the ultralimit framework is that one can rigorously
work with such equivalence relations as "$x$ and $y$ differ by $O(1)$", for instance
by quotienting ${}^*\mathbb{R}$ by the subring $\overline{\mathbb{R}}$; in the finitary setting, this relation is only
"morally" an equivalence relation (because of the need to quantify the constants in
the $O()$ notation).

Suppose one has a limit function $f : \Omega \to {}^*\mathbb{C}$ on a limit set $\Omega$. If one asserts that
$f(x) = O(1)$ for each $x \in \Omega$, one may be concerned that this statement provides
no uniformity in $x$. However, it turns out such uniformity is automatic for limit
functions.

Lemma A.5 (Automatic uniformity). Let $D \in \mathbb{N}^+$, and let $f : \Omega \to {}^*\mathbb{C}^D$ be a
limit function on a limit set $\Omega$. Then the following statements are equivalent:

• (Pointwise boundedness) For each $x \in \Omega$, one has $f(x) \in \overline{\mathbb{C}}^D$ (i.e. $f(x) = O(1)$
for all $x \in \Omega$).

• (Uniform boundedness) There exists a standard real $C$ such that $|f(x)| \leq C$
for all $x \in \Omega$.

Intuitively, this lemma is asserting that the only types of functions that always 
map unbounded sequences to bounded sequences (but with a bound possibly de- 
pending on the initial sequence) are those functions that are uniformly bounded. 
The lemma can clearly fail if one considers functions $f$ that are not limit functions;
thus it will be important to establish the limit nature of various functions in the 
arguments below. This lemma is also closely related to the overspill principle in 
nonstandard analysis, or the model-theoretic fact that ultraproducts are countably 
saturated. 
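To make the role of the limit hypothesis concrete, here is a small illustration of our own (it is not taken from the paper): an external function on $\Omega = {}^*\mathbb{N}$ that is pointwise bounded but not uniformly bounded. The function $f$ and the choice of $\Omega$ are assumptions made purely for illustration.

```latex
% Illustrative example (an assumption of this note, not from the paper):
% an external function on \Omega = {}^*\mathbb{N}.
\[
  f(x) :=
  \begin{cases}
    x, & x \in \mathbb{N} \ \text{(standard)},\\
    0, & x \in {}^*\mathbb{N} \setminus \mathbb{N} \ \text{(unbounded)}.
  \end{cases}
\]
% Pointwise bounded: every value f(x) is either a standard natural number or 0.
% Not uniformly bounded: f(M+1) = M+1 > M for every standard M.
% By Lemma A.5, f therefore cannot be a limit function; this is consistent with
% the fact that \mathbb{N} is not a limit subset of {}^*\mathbb{N}.
```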

Proof. Clearly uniform boundedness implies pointwise boundedness, so we show
the converse. Suppose for contradiction that $f$ was pointwise bounded but not
uniformly bounded. Then for every standard integer $M$ there exists an element $x_M$
in $\Omega$ such that $|f(x_M)| > M$.

Write $\Omega$ as the ultralimit of standard sets $\Omega_n$, write $f$ as an ultralimit of a
sequence $f_n : \Omega_n \to \mathbb{C}^D$, and write $x_M = \lim_{n \to p} x_{M,n}$ with $x_{M,n} \in \Omega_n$. Thus for each
standard $M$, the statement $|f_n(x_{M,n})| > M$ is true for $n$ sufficiently close to $p$.

Now we diagonalise. Set $y = \lim_{n \to p} y_n$, where $y_n := x_{n,n}$. Then $y \in \Omega$ and one
sees that for every standard $M$, the statement $|f_n(y_n)| > M$ holds for $n$ sufficiently
close to $p$, thus $f(y)$ is unbounded. But this contradicts pointwise boundedness. □

We observe a useful corollary to Lemma A.5.

Corollary A.6 (Automatic uniform lower bounds). Let $D \in \mathbb{N}^+$, and let
$f : \Omega \to {}^*\mathbb{C}^D$ be a limit function on a limit set $\Omega$ such that $|f(x)| \gg 1$ for all $x \in \Omega$. Then
there exists a standard $c > 0$ such that $|f(x)| \geq c$ for all $x \in \Omega$.

Proof. Apply Lemma A.5 to $1/|f|$. □

Inspired by Lemma A.5, we shall simply call a limit function $f : \Omega \to {}^*\mathbb{C}^D$
bounded if it is either pointwise bounded or uniformly bounded. The space of all
bounded limit functions from $\Omega$ to ${}^*\mathbb{C}^D$ will be denoted $L^\infty(\Omega \to \overline{\mathbb{C}}^D)$, and we also
write

$L^\infty(\Omega) = L^\infty(\Omega \to \overline{\mathbb{C}}^\omega) := \bigcup_{D \in \mathbb{N}^+} L^\infty(\Omega \to \overline{\mathbb{C}}^D)$. (A.1)

When $D = 1$, $L^\infty(\Omega \to \overline{\mathbb{C}})$ is a *-algebra over the bounded complex numbers
$\overline{\mathbb{C}}$ (i.e. it is closed under addition, pointwise multiplication, complex conjugation,
and multiplication by bounded complex numbers). It is not, however, a limit set.

For higher dimensions $D > 1$, we still have the operations of addition, complex
conjugation (conjugating each coefficient of $\overline{\mathbb{C}}^D$ separately), and multiplication by
bounded complex numbers. However, we do not have a natural product on $\overline{\mathbb{C}}^D$.
Instead, we will use the tensor product $\otimes : \overline{\mathbb{C}}^D \times \overline{\mathbb{C}}^{D'} \to \overline{\mathbb{C}}^{DD'}$, defined earlier in the paper. This
induces a tensor product

$\otimes : L^\infty(\Omega \to \overline{\mathbb{C}}^D) \times L^\infty(\Omega \to \overline{\mathbb{C}}^{D'}) \to L^\infty(\Omega \to \overline{\mathbb{C}}^{DD'})$

for any $\Omega$, which is then a bilinear operation on $L^\infty(\Omega \to \overline{\mathbb{C}}^\omega)$. Strictly speaking,
this tensor product is neither commutative nor associative. However, it is
"essentially" commutative and associative in the following sense. Let us say that
a function $f \in L^\infty(\Omega \to \overline{\mathbb{C}}^D)$ is a bounded linear combination of another function
$f' \in L^\infty(\Omega \to \overline{\mathbb{C}}^{D'})$ if there exists a linear transformation $T : {}^*\mathbb{C}^{D'} \to {}^*\mathbb{C}^D$
with bounded coefficients such that $f = T \circ f'$. Then it is clear that for any
$f_1, f_2, f_3 \in L^\infty(\Omega \to \overline{\mathbb{C}}^\omega)$, we have that $f_2 \otimes f_1$ is a bounded linear combination of
$f_1 \otimes f_2$, and that $f_1 \otimes (f_2 \otimes f_3)$ is a bounded linear combination of $(f_1 \otimes f_2) \otimes f_3$.
This will be a satisfactory substitute for commutativity and associativity for our
purposes.

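We define the spheres

$S^{2D-1} := \{z \in \mathbb{C}^D : |z| = 1\}$

and

$\overline{S^\omega} := \bigcup_{D \in \mathbb{N}^+} \overline{S^{2D-1}} = \{z \in \overline{\mathbb{C}}^\omega : |z| = 1\}$

and observe that $\overline{S^\omega}$ is closed under complex conjugation and tensor product, and
so $L^\infty(\Omega \to \overline{S^\omega})$ is also. Also, observe that for any $f \in L^\infty(\Omega \to \overline{S^\omega})$, 1 is a
bounded linear combination of $f \otimes \overline{f}$.

When $\Omega$ is a non-empty limit finite set (e.g. $\Omega = [N]$ or $\Omega = [N]^k$ for some positive
limit integer $N$ and some standard $k \geq 1$), we have some additional structures.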

Definition A.7 (Bias and correlation). Let $\Omega$ be a non-empty limit finite set.
Given two functions $f \in L^\infty(\Omega \to \overline{\mathbb{C}}^\omega)$, $g \in L^\infty(\Omega \to \overline{\mathbb{C}}^\omega)$, we say that $f$ and $g$
correlate if one has

$|\mathbb{E}_{n \in \Omega} f(n) \otimes \overline{g(n)}| \gg 1$,

and that $f$ is biased if one has

$|\mathbb{E}_{n \in \Omega} f(n)| \gg 1$,

i.e. if $f$ correlates with 1. We say that $f$ is unbiased if it is not biased. We define
the $L^p$ norms

$\|f\|_{L^p(\Omega)} := (\mathbb{E}_{n \in \Omega} |f(n)|^p)^{1/p}$

for $1 \leq p < \infty$, with the usual convention

$\|f\|_{L^\infty(\Omega)} := \sup_{n \in \Omega} |f(n)|$;

these are bounded limit non-negative numbers.

We will also find the following notation useful. 

Definition A.8 (Density). We say that a limit subset $H$ of a limit finite set $X$ is
dense if $|H| \gg |X|$, and that a statement $P(x)$ is true for many $x \in X$ if it is true
for all $x$ in a dense subset $H$ of $X$. If instead $|H| = o(|X|)$, we say that $H$ is a
sparse subset of $X$, and if $P(x)$ only holds true for $x$ in a sparse set, we say that
$P(x)$ only holds for few $x \in X$. If the complement of $H$ in $X$ is sparse, we say that
$H$ is a co-sparse subset of $X$, and if $P(x)$ holds for all $x$ in a co-sparse subset, we
say that $P(x)$ holds for almost all $x \in X$.

A function $f : X \to {}^*\mathbb{C}^D$ is said to be almost bounded if $f(x) \in \overline{\mathbb{C}}^D$ for almost
all $x \in X$. (For instance, for an unbounded limit natural number $N$, the function
$n \mapsto$ is almost bounded on $[N]$.)

Remarks. Note that the statement $P$ does not need to be a limit statement (i.e.
the set $\{x \in X : P(x) \text{ true}\}$ need not be a limit set) for these definitions to make
sense; for instance, for $P$ to hold for many $x$, it suffices that $\{x \in X : P(x) \text{ true}\}$
contains a dense limit subset of $X$; it need not be a limit set itself. If one property
$P(x)$ holds for almost all $x \in X$, and another property $Q(x)$ holds for many $x \in X$,
then $P(x)$ and $Q(x)$ simultaneously hold for many $x \in X$. However, if $P$ only holds
for many $x$ rather than for almost all $x$, then it need not be the case that $P(x)$ and
$Q(x)$ simultaneously hold for any $x$.

From the pigeonhole principle we see that if a limit set is partitioned into a
bounded number of limit pieces, then at least one of the pieces is dense. We can
strengthen this principle as follows.

Lemma A.9 (Pigeonhole principle). Let $X$ be a limit finite set, and let $f$ be an
almost bounded limit function from $X$ to ${}^*\mathbb{N}$. Then there exists a dense subset of
$X$ on which $f$ is constant and equal to a standard natural number.

Proof. By hypothesis, $f$ is bounded on almost all of $X$, and hence uniformly
bounded on almost all of $X$ by Lemma A.5. The claim now follows from the
pigeonhole principle. □

We also record here a technical lemma regarding correlation. 

Definition A.10 ($\sigma$-limit). A subset $S$ of a limit set $X$ is said to be a $\sigma$-limit
set if there is a limit sequence $n \mapsto S_n$ from limit natural numbers $n \in {}^*\mathbb{N}$ of
limit subsets $S_n$ of $X$, such that $S$ is the union of the $S_n$ over all standard natural
numbers $n$.

Example. If $\Omega$ is a limit set and $D \in \mathbb{N}^+$, then the space $L^\infty(\Omega \to \overline{\mathbb{C}}^D)$, which
is an external (i.e. non-limit) subset of the limit space of all limit functions from
$\Omega$ to ${}^*\mathbb{C}^D$, is a $\sigma$-limit space, since one can express this space as the union, over all
standard $M$, of the functions bounded uniformly in magnitude by $M$. Similarly,
$L^\infty(\Omega \to \overline{\mathbb{C}}^\omega)$ is also a $\sigma$-limit set.

Lemma A.11 (Limit selection lemma). Let $X, Y$ be limit sets, let $R \subseteq X \times Y$ be
a limit relation between $X$ and $Y$, and let $S$ be a $\sigma$-limit subset of $Y$. Suppose
that for every $x \in X$ there exists $s_x \in S$ such that $(x, s_x) \in R$. Then there exists a
limit function $x \mapsto s_x$ from $X$ to $S$ such that $(x, s_x) \in R$ for all $x \in X$.

Remark. The key point here is the limit nature of the assignment $x \mapsto s_x$;
for external (i.e. non-limit) assignments, the claim is immediate from the axiom 
of choice. There is a similar need for such "measurable selection lemmas" in the 
ergodic theory analogue of the inverse conjectures for the Gowers norms, see e.g. 
[36l Appendix A] or Lemma C.4]. 

Proof. We may assume that the sets $S_n$ in Definition A.10 are increasing in $n$.

For each $x \in X$, let $n_x$ be the first limit natural number such that $(x, s) \in R$
for some $s \in S_{n_x}$. By construction, $x \mapsto n_x$ is a limit map from $X$ to ${}^*\mathbb{N}$ which
is pointwise bounded. Thus, by Lemma A.5, $n_x$ is uniformly bounded by some
standard natural number $n_*$; thus for every $x \in X$ the set $\{s \in S_{n_*} : (x, s) \in R\}$ is
non-empty. Applying a limit choice function, we may thus find a limit map $x \mapsto s_x$
with the stated properties. □

We isolate a special case of this lemma. 

Corollary A.12. Let $\Omega$ be a non-empty limit finite set, let $S$ be a $\sigma$-limit subset
of $L^\infty(\Omega \to \overline{\mathbb{C}}^\omega)$, and let $(f_h)_{h \in H}$ be a limit family of limit functions
$f_h \in L^\infty(\Omega \to \overline{\mathbb{C}}^\omega)$ indexed by a limit set $H$, and suppose that for each $h \in H$, $f_h$ correlates with
an element of $S$. Then one can find a limit family $(\phi_h)_{h \in H}$ of functions $\phi_h \in S$
such that $f_h$ correlates with $\phi_h$ for all $h \in H$.

Proof. Write $S$ as the union of limit sets $S_n$ for standard $n$, and let
$S' := \bigcup_{n \in \mathbb{N}} S_n \times \{n\}$. Note that this is a $\sigma$-limit subset of $L^\infty(\Omega \to \overline{\mathbb{C}}^\omega) \times {}^*\mathbb{N}$. Defining a relation
$R$ between $H$ and $S'$ by declaring $(h, (\phi, n)) \in R$ if $|\mathbb{E}_{x \in \Omega} f_h(x) \otimes \overline{\phi(x)}| \geq 1/n$ and
applying Lemma A.11 we obtain the claim. □

Appendix B. Polynomial algebra 

In Section 6 we introduced the notion of a polynomial map between $I$-filtered
groups $H$ and $G$ when the group $H$ was abelian (Definition 6.18). In this appendix
we study the more general notion of a polynomial map, no longer restricting to the
case $H$ abelian. The concept of a polynomial map between groups was introduced
by Leibman in [121 |H], and here we adapt it to filtered groups.

Recall the definitions of an ordering $I$ and of an $I$-filtration of a group $G$ in
Definitions 6.7 and 6.8.

Definition B.1 (Polynomial map). Let $G, H$ be groups with $I$-filtrations $G_I, H_I$.
If $g : H \to G$ is a map then we define the derivative $\partial_h g : H \to G$ by the formula

$\partial_h g(n) := g(hn) g(n)^{-1}$

for all $n \in H$. We say that a map $g : H \to G$ is polynomial if one has

$\partial_{h_1} \cdots \partial_{h_m} g(n) \in G_{i_1 + \cdots + i_m}$

whenever $m \geq 0$, $i_1, \ldots, i_m \in I$, $h_j \in H_{i_j}$ for $j = 1, \ldots, m$ and $n \in H_0$. The space
of all polynomial maps is denoted $\mathrm{poly}(H_I \to G_I)$.

Remark. As mentioned in 331 if G or H are written as additive groups instead 
of multiplicative ones, the definition of partial derivative is adjusted appropriately. 

Example 1. If $I = \mathbb{N}$, and $H$ is abelian and is given the filtration $H_i = H$ for
$i = 0, 1$ and $H_i = \{0\}$ for $i > 1$, then a map $g : H \to G$ lies in $\mathrm{poly}(H_{\mathbb{N}} \to G_{\mathbb{N}})$ if and
only if

$\partial_{h_1} \cdots \partial_{h_m} g(n) \in G_m$

for all $m \geq 0$ and $h_1, \ldots, h_m \in H$. This coincides with the definition given in
[24, Definition 6.1]. Definition B.1 may be considered as a generalisation of this in
which the domain group $H$ is allowed to have nontrivial filtrations.

Example 2. Any map $\phi : G \to H$ between two $I$-filtered groups which is constant
and takes values in $H_0$ is polynomial.

Example 3. If $\phi : H_I \to G_I$ is a homomorphism of $I$-filtered groups that maps
$H_i$ to $G_i$ for each $i \in I$, then $\phi$ is a polynomial map since, for each $h \in H$,
$\partial_h \phi$ is the constant map $n \mapsto \phi(h)$. We will call such homomorphisms $I$-filtered
homomorphisms from the $I$-filtered group $H_I$ to the $I$-filtered group $G_I$.

Example 4. If $G$ is an $I$-filtered group, and $g \in G$, then the left translation
maps $x \mapsto gx$ lie in $\mathrm{poly}(G_I \to G_I)$. Indeed, the derivative of this map in any
direction $h \in G_i$ is simply the constant map $g h g^{-1}$, which lies in $G_i$, and any
further derivative of this map is trivial. This example is a special case of the
Lazard-Leibman theorem (Corollary B.4 below), since the translation map is the
product of a constant map and the identity homomorphism.
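The computation behind Example 4 can be written out in two lines. The sketch below assumes the commutator convention $[a,b] := a^{-1}b^{-1}ab$ and that $g \in G_0$; the name $L_g$ for the translation map is introduced here only for illustration.

```latex
% Derivative of the left translation L_g(x) := g x in a direction h \in G_i,
% using \partial_h f(x) := f(hx) f(x)^{-1} from Definition B.1.
% (L_g is a name introduced for this sketch; g \in G_0 is assumed.)
\begin{align*}
  \partial_h L_g(x) &= L_g(hx)\, L_g(x)^{-1} = (g h x)(g x)^{-1} = g h g^{-1},\\
  \partial_{h'} \partial_h L_g(x) &= (g h g^{-1})(g h g^{-1})^{-1} = \mathrm{id}.
\end{align*}
% Since g h g^{-1} = h\,[h, g^{-1}] with [a,b] := a^{-1} b^{-1} a b, and
% [G_i, G_0] \subseteq G_i by the filtration property, the first derivative
% takes values in G_i, as claimed.
```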

Example 5. Given three $I$-filtered groups $H, G, G'$, a map $g : H \to G \times G'$ is
polynomial ($G \times G'$ is given the product filtration) if and only if its projections to
$G$ and $G'$ are polynomial. In other words, we have a canonical isomorphism

$\mathrm{poly}(H_I \to (G \times G')_I) \equiv \mathrm{poly}(H_I \to G_I) \times \mathrm{poly}(H_I \to G'_I)$.

Host-Kra cube groups. There is an important alternative characterisation of
polynomial maps in terms of Host-Kra cube groups, which we now define. The material
in this section is a generalisation of [24], and particularly [24, Proposition 6.5],
to the context of polynomial maps $\mathrm{poly}(H_I \to G_I)$ (there matters were discussed
only in the case $\mathrm{poly}(\mathbb{Z} \to G_I)$). The Host-Kra groups are the group-theoretic
analogue of the Host-Kra spaces of a dynamical system $X$ introduced in [5B"].

If $m$ is a natural number, we let $2^{[m]}$ be the power set of $[m] := \{1, \ldots, m\}$.

Definition B.2. Let $G$ be an $I$-filtered group, and let $i_1, \ldots, i_m \in I$. We define
the Host-Kra cube group $\mathrm{HK}^{i_1,\ldots,i_m}(G_I)$ to be the subgroup of $G^{2^{[m]}}$ generated by
the elements of the form

$\iota_{\omega_0}(g_{\omega_0}) := (g_\omega)_{\omega \subseteq [m]}$

where $\omega_0 \subseteq [m]$, $g_{\omega_0} \in G_{\sum_{j \in \omega_0} i_j}$, and $g_\omega$ equals $g_{\omega_0}$ when $\omega \supseteq \omega_0$ and is the identity
otherwise. Thus we see that the $\iota_{\omega_0}$ are embeddings of $G_{\sum_{j \in \omega_0} i_j}$ into $\mathrm{HK}^{i_1,\ldots,i_m}(G)$.
We refer to $m$ as the order of the Host-Kra cube groups, and refer to elements of
$\mathrm{HK}^{i_1,\ldots,i_m}(G)$ as cubes of dimension $m$ and degrees $i_1, \ldots, i_m$.
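As a sanity check on the definition, the smallest non-trivial case $m = 1$ can be described explicitly. The sketch below writes a tuple indexed by $2^{[1]} = \{\emptyset, \{1\}\}$ as a pair, and assumes (as in the usual definition of a filtration) that $G_i \subseteq G_0$ and $[G_0, G_i] \subseteq G_i$; the description is ours and is meant only as an illustration.

```latex
% Case m = 1 of Definition B.2, with a single degree i \in I.
% A tuple is written as (g_{\emptyset}, g_{\{1\}}).
% Generators: \iota_{\emptyset}(g) = (g, g)              for g  \in G_0,
%             \iota_{\{1\}}(g') = (\mathrm{id}, g')      for g' \in G_i.
\[
  \mathrm{HK}^{i}(G) = \{ (x, y) \in G_0 \times G_0 : y x^{-1} \in G_i \}.
\]
% Inclusion "\subseteq": both generator types satisfy y x^{-1} \in G_i, and this
%   condition is preserved under products and inverses because G_i is normal
%   in G_0:  (y y')(x x')^{-1} = y\,(y' x'^{-1})\,y^{-1} \cdot y x^{-1}.
% Inclusion "\supseteq": (x, y) = (\mathrm{id}, y x^{-1}) \cdot (x, x).
```

This matches the decomposition used in the proof of Theorem B.3, where a cube of one higher order is split into a lower-order cube and a "difference" lying in a shifted group.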

Example. Let $G$ be a $k$-step nilpotent group, and let $G_i = [G, G_{i-1}]$ be the lower
central series filtration. Then $\mathrm{HK}^{1,\ldots,1}(G)$ is the subgroup of $G^{2^{[m]}}$ generated by
the "side" elements $(g^i_\omega)_{\omega \subseteq [m]}$ where $g^i_\omega = g$ if $i \in \omega$ and $g^i_\omega = \mathrm{id}$ otherwise, for
$i = 1, \ldots, m$, and by the diagonal elements $(g, \ldots, g)$.

Theorem B.3. Let $G, H$ be $I$-filtered groups, and let $g : H \to G$ be a map. Then $g$
is a polynomial map if and only if it preserves cubes, in the sense that for every $m \geq 0$
and $i_1, \ldots, i_m \in I$, the map $g^{2^{[m]}} : H^{2^{[m]}} \to G^{2^{[m]}}$ defined by

$g^{2^{[m]}}((h_\omega)_{\omega \subseteq [m]}) := (g(h_\omega))_{\omega \subseteq [m]}$

maps $\mathrm{HK}^{i_1,\ldots,i_m}(H_I)$ to $\mathrm{HK}^{i_1,\ldots,i_m}(G_I)$.
Proof. For inductive reasons it is convenient to establish the following slightly
stronger result. For any $m_0 \geq 0$, we say that a map $g : H \to G$ is polynomial
to order $m_0$ if we have

$\partial_{h_1} \cdots \partial_{h_m} g(n) \in G_{i_1 + \cdots + i_m}$

for all $m$ with $0 \leq m \leq m_0$, all $i_1, \ldots, i_m \in I$, all $h_j \in H_{i_j}$ for $j = 1, \ldots, m$, and all
$n \in H_0$. It will suffice to show that a map $g : H \to G$ is polynomial to order $m_0$ if
and only if it preserves the cubes of dimension up to $m_0$.

We establish this by induction on $m_0$. The case $m_0 = 0$ is easy: $g$ is polynomial
to order $0$ if it maps $H_0$ to $G_0$, but these are also essentially the Host-Kra groups
of order $0$, and the claim follows. Now suppose inductively that $m_0 \geq 1$ and that
the claim has already been shown for all smaller values of $m_0$.

Suppose first that $g : H \to G$ preserves all cubes of dimension up to $m_0$. Then
by the preceding discussion, $g$ maps $H_0$ to $G_0$. To show that $g$ is polynomial to
order $m_0$, it thus suffices to show that for every $i \in I$ and $h \in H_i$, $\partial_h g$ is polynomial
to order $m_0 - 1$ in the shifted $I$-filtration $G_I^{+i}$ defined by

$G_I^{+i} := (G_{j+i})_{j \in I}$. (B.1)

By the induction hypothesis, it suffices to show that $\partial_h g$ preserves cubes of dimension
$m_0 - 1$. Accordingly, let $\mathbf{h} = (h_\omega)_{\omega \subseteq [m_0 - 1]}$ be an element of $\mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(H)$.
We may view $(\mathbf{h}, h \cdot \mathbf{h})$ as an element of $\mathrm{HK}^{i_1,\ldots,i_{m_0-1},i}(H)$ of one higher order,
where $h \cdot \mathbf{h} := (h h_\omega)_{\omega \subseteq [m_0 - 1]}$. By hypothesis on $g$, we have

$(g^{2^{[m_0-1]}}(\mathbf{h}), g^{2^{[m_0-1]}}(h \cdot \mathbf{h})) \in \mathrm{HK}^{i_1,\ldots,i_{m_0-1},i}(G)$.

An inspection of Definition B.2 reveals that $(\mathbf{g}_1, \mathbf{g}_2)$ lies in $\mathrm{HK}^{i_1,\ldots,i_{m_0-1},i}(G)$ if and
only if $\mathbf{g}_1$ lies in $\mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(G)$ and $\mathbf{g}_2 \mathbf{g}_1^{-1}$ lies in $\mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(G, G_I^{+i})$ (which
is easily seen to be a normal subgroup of $\mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(G)$). We conclude that

$g^{2^{[m_0-1]}}(h \cdot \mathbf{h}) \cdot g^{2^{[m_0-1]}}(\mathbf{h})^{-1} \in \mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(G, G_I^{+i})$.

But

$g^{2^{[m_0-1]}}(h \cdot \mathbf{h}) \cdot g^{2^{[m_0-1]}}(\mathbf{h})^{-1} = (\partial_h g)^{2^{[m_0-1]}}(\mathbf{h})$,

and the claim follows.

Next, suppose conversely that $g : H \to G$ is a polynomial map of order up to
$m_0$; by the inductive hypothesis, it suffices to show that $g$ preserves all the cubes of
dimension exactly $m_0$. Accordingly, let $\mathbf{h}$ be an element of $\mathrm{HK}^{i_1,\ldots,i_{m_0}}(H)$ of this
dimension. Arguing as before, we may write

$\mathbf{h} = (\mathbf{h}_1, \mathbf{h}_2 \mathbf{h}_1)$

where $\mathbf{h}_1 \in \mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(H)$ and $\mathbf{h}_2 \in \mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(H, H_I^{+i_{m_0}})$. Our objective is
then to show that

$g^{2^{[m_0]}}(\mathbf{h}) = (g^{2^{[m_0-1]}}(\mathbf{h}_1), g^{2^{[m_0-1]}}(\mathbf{h}_2 \mathbf{h}_1))$

lies in $\mathrm{HK}^{i_1,\ldots,i_{m_0}}(G)$. By the decomposition of $\mathrm{HK}^{i_1,\ldots,i_{m_0}}(G)$, it thus suffices to
show that

$g^{2^{[m_0-1]}}(\mathbf{h}_2 \mathbf{h}_1) \, g^{2^{[m_0-1]}}(\mathbf{h}_1)^{-1} \in \mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(G, G_I^{+i_{m_0}})$. (B.2)

Recall that $\mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(H, H_I^{+i_{m_0}})$ is generated by elements of the form $\iota_{\omega_0}(h_{\omega_0})$,
where $\omega_0 \subseteq [m_0 - 1]$ and $h_{\omega_0} \in H_{\sum_{j \in \omega_0} i_j + i_{m_0}}$. By a telescoping series, we thus see
that to establish the above claim it suffices to do so under the additional assumption
that $\mathbf{h}_2$ is a generator

$\mathbf{h}_2 = \iota_{\omega_0}(h_{\omega_0})$

for some $\omega_0 \subseteq [m_0 - 1]$ and $h_{\omega_0} \in H_{\sum_{j \in \omega_0} i_j + i_{m_0}}$.

By relabeling we may assume that $\omega_0 = \{m' + 1, \ldots, m_0 - 1\}$ for some $0 \leq m' \leq
m_0 - 1$. The left-hand side of (B.2) then simplifies to

$(\partial_{h_{\omega_0}} g)^{2^{[m']}}(\mathbf{h}'_1)$, (B.3)

where $\mathbf{h}'_1$ is the restriction of $\mathbf{h}_1$ to $2^{[m']}$, and we embed $G^{2^{[m']}}$ into $G^{2^{[m_0-1]}}$ by
identifying $(g_\omega)_{\omega \subseteq [m']}$ with the tuple $(\tilde g_\omega)_{\omega \subseteq [m_0-1]}$, where $\tilde g_\omega$ is equal to $g_{\omega \cap [m']}$
when $\omega$ contains $\omega_0$, and is equal to the identity otherwise.

But by induction hypothesis, (B.3) lies in $\mathrm{HK}^{i_1,\ldots,i_{m'}}(G, G_I^{+\sum_{j \in \omega_0} i_j + i_{m_0}})$. By
Definition B.2, this embeds into $\mathrm{HK}^{i_1,\ldots,i_{m_0-1}}(G, G_I^{+i_{m_0}})$, giving (B.2) as desired,
and the claim follows. □

Theorem B.3 has two immediate corollaries.

Corollary B.4 (Lazard-Leibman theorem). Let $G, H$ be $I$-filtered groups. Then
$\mathrm{poly}(H_I \to G_I)$ is also a group (using pointwise multiplication as the group
operation).

Corollary B.5 (Composition). Let $G, H, K$ be $I$-filtered groups. If $g \in \mathrm{poly}(H_I \to G_I)$
and $h \in \mathrm{poly}(K_I \to H_I)$, then $g \circ h \in \mathrm{poly}(K_I \to G_I)$.

In other words, for any fixed $I$, the class of $I$-filtered groups together with their
polynomial maps forms a category. It is remarkably difficult to establish Corollary
B.5 in full generality without the machinery of Host-Kra cube groups.

Example. If $G, H$ are $I$-filtered groups with $H = (H, +)$ abelian, and $g$ is a
polynomial map from $H$ to $G$, then the translates $g(\cdot + h)$ and dilates $g(q \cdot)$ for
$h \in H$ and $q \in \mathbb{Z}$ are also polynomial maps from $H$ to $G$, thanks to Corollary B.5
and Examples 3 and 4 following Definition B.1. More generally, if $\phi : H' \to H$ is a
filtered homomorphism and $g \in \mathrm{poly}(H_I \to G_I)$, then $g \circ \phi \in \mathrm{poly}(H'_I \to G_I)$.

Example. Using Corollary B.4 we can establish that any algebraic word $w$ on $k$
generators defines a polynomial map from $H^k$ to $H$ for any $I$-filtered group $H$. For
instance, the map $(g, h) \mapsto g^2 h^{-3} g h$ is a polynomial map from $H \times H$ to $H$.

We can strengthen Corollary B.4 slightly, by giving $\mathrm{poly}(H_I \to G_I)$ the structure
of an $I$-filtered group:

Proposition B.6 (Filtered Lazard-Leibman theorem). Let $(G, G_I)$, $(H, H_I)$ be
$I$-filtered groups. Then $\mathrm{poly}(H_I \to G_I)$ is also an $I$-filtered group, with filtration
$(\mathrm{poly}(H_I \to G_I^{+i}))_{i \in I}$, where the shifted filtration $G_I^{+i}$ was defined in (B.1). In
particular, the $\mathrm{poly}(H_I \to G_I^{+i})$ are normal subgroups of $\mathrm{poly}(H_I \to G_I)$.

Proof. The only non-trivial claim to show is that if $g_i \in \mathrm{poly}(H_I \to G_I^{+i})$ and
$g_j \in \mathrm{poly}(H_I \to G_I^{+j})$ for some $i, j \in I$, then $[g_i, g_j] \in \mathrm{poly}(H_I \to G_I^{+i+j})$. It
suffices to show for each $m_0 \geq 0$ that if $g_i, g_j$ are polynomial maps up to order $m_0$
from $(H, H_I)$ to $(G, G_I^{+i})$, $(G, G_I^{+j})$ respectively, then $[g_i, g_j]$ is a polynomial map
up to order $m_0$ from $(H, H_I)$ to $(G, G_I^{+i+j})$.

Again we induct on $m_0$. The case $m_0 = 0$ is trivial, so suppose that $m_0 \geq 1$ and
that the claim has already been proven for smaller values of $m_0$.

As $g_i, g_j$ map $H_0$ to $G_i, G_j$ respectively, $[g_i, g_j]$ maps $H_0$ to $G_{i+j}$. It thus suffices
to show that for each $k \in I$ and $h \in H_k$, $\partial_h [g_i, g_j]$ is a polynomial map up to
order $m_0 - 1$ from $(H, H_I)$ to $(G, G_I^{+i+j+k})$. But a brief calculation shows that

$\partial_h [g_i, g_j] = g_i^{-1} (\partial_h g_i)^{-1} g_j^{-1} (\partial_h g_j)^{-1} (\partial_h g_i) g_i (\partial_h g_j) g_i^{-1} g_j g_i$. (B.4)

By induction hypothesis (and Corollary B.4), the maps that are polynomial up to
order $m_0 - 1$ from $(H, H_I)$ to $(G, G_I^{+i+j+k})$ form a normal subgroup of the maps
that are polynomial up to order $m_0 - 1$ from $(H, H_I)$ to $(G, G_I)$. If we quotient
out by this normal subgroup, then a further application of the induction hypothesis
shows that $\partial_h g_i$ commutes with $g_j$ and $\partial_h g_j$, and that $g_i$ commutes with $\partial_h g_j$. An
inspection of (B.4) then shows that the right-hand side vanishes once one quotients
out by this normal subgroup, and the claim follows. □
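The "brief calculation" behind (B.4), and the cancellation used at the end of the proof, can be spelled out as follows. This is a sketch under the conventions $[a,b] := a^{-1}b^{-1}ab$ and $\partial_h g(n) := g(hn)g(n)^{-1}$ (the commutator convention is an assumption, chosen because it reproduces (B.4) exactly); all functions on the right-hand sides are evaluated at $n$.

```latex
% Conventions assumed: [a,b] := a^{-1} b^{-1} a b and
% \partial_h g(n) := g(hn) g(n)^{-1}, so that g(hn) = (\partial_h g)(n)\, g(n).
\begin{align*}
  [g_i,g_j](hn)
    &= g_i^{-1}(\partial_h g_i)^{-1}\, g_j^{-1}(\partial_h g_j)^{-1}\,
       (\partial_h g_i) g_i\, (\partial_h g_j) g_j, \\
  [g_i,g_j](n)^{-1}
    &= g_j^{-1} g_i^{-1} g_j g_i, \\
  \partial_h [g_i,g_j]
    &= g_i^{-1}(\partial_h g_i)^{-1} g_j^{-1}(\partial_h g_j)^{-1}
       (\partial_h g_i) g_i (\partial_h g_j) g_i^{-1} g_j g_i .
\end{align*}
% Modulo the normal subgroup, \partial_h g_i commutes with g_j and \partial_h g_j,
% and g_i commutes with \partial_h g_j.  Moving (\partial_h g_i) to the left to
% cancel (\partial_h g_i)^{-1}, and then cancelling (\partial_h g_j)^{\pm 1}:
\begin{align*}
  \partial_h [g_i,g_j]
    &\equiv g_i^{-1} g_j^{-1} (\partial_h g_j)^{-1} g_i (\partial_h g_j) g_i^{-1} g_j g_i \\
    &\equiv g_i^{-1} g_j^{-1} g_i\, g_i^{-1} g_j g_i = \mathrm{id}.
\end{align*}
```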

Proposition B.6 has some useful corollaries:

Corollary B.7 (Approximate linearity and commutativity). Let $G, H$ be $I$-filtered
groups, let $i, j, k, l \in I$, and let $g_i \in \mathrm{poly}(H_I \to G_I^{+i})$, $g_j \in \mathrm{poly}(H_I \to G_I^{+j})$,
$h_k \in H_k$, and $h_l \in H_l$. Then we have

$\partial_{h_k}(g_i g_j) = (\partial_{h_k} g_i)(\partial_{h_k} g_j) \mod \mathrm{poly}(H_I \to G_I^{+i+j+k})$ (B.5)

and

$\partial_{h_k h_l}(g_i) = (\partial_{h_k} g_i)(\partial_{h_l} g_i) \mod \mathrm{poly}(H_I \to G_I^{+i+k+l})$. (B.6)

If $H$ is abelian, we also have

$(\partial_{h_l} g_i)(\partial_{h_k} g_i) = (\partial_{h_k} g_i)(\partial_{h_l} g_i) \mod \mathrm{poly}(H_I \to G_I^{+i+k+l})$. (B.7)

Proof. The conclusions (B.5), (B.6) follow from Proposition B.6 and the identities

$\partial_{h_k}(g_i g_j) = (\partial_{h_k} g_i)(\partial_{h_k} g_j)[\partial_{h_k} g_j, g_i^{-1}]$

and

$\partial_{h_k h_l}(g_i) = (\partial_{h_l} \partial_{h_k} g_i)(\partial_{h_k} g_i)(\partial_{h_l} g_i)$. (B.8)

The identity (B.7) then follows by swapping the roles of $h_k$ and $h_l$ in (B.6). □
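For readers who wish to check the identity (B.8) directly, the following short verification expands each factor using only the definition $\partial_h g(n) := g(hn)g(n)^{-1}$ from Definition B.1; we write $g$ for $g_i$ to lighten notation.

```latex
% All expressions are evaluated at n; \partial_h g(n) := g(hn) g(n)^{-1}.
\begin{align*}
  (\partial_{h_l}\partial_{h_k} g)(n)
    &= (\partial_{h_k} g)(h_l n)\,(\partial_{h_k} g)(n)^{-1} \\
    &= g(h_k h_l n)\, g(h_l n)^{-1}\, g(n)\, g(h_k n)^{-1}, \\
  (\partial_{h_l}\partial_{h_k} g)(n)\,(\partial_{h_k} g)(n)
    &= g(h_k h_l n)\, g(h_l n)^{-1}, \\
  (\partial_{h_l}\partial_{h_k} g)(n)\,(\partial_{h_k} g)(n)\,(\partial_{h_l} g)(n)
    &= g(h_k h_l n)\, g(n)^{-1} = (\partial_{h_k h_l} g)(n).
\end{align*}
```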

Next, we make the useful observation that in order to check polynomiality of a 
map, it suffices to do so on generators. 

Proposition B.8 (Checking polynomiality on generators). Let $G, H$ be $I$-filtered
groups. For each $i \in I$, let $E_i$ be a set of generators for $H_i$. Then a map $g : H \to G$
is polynomial if and only if one has

$\partial_{h_1} \cdots \partial_{h_m} g(n) \in G_{i_1 + \cdots + i_m}$ (B.9)

for all $m \geq 0$, all $i_1, \ldots, i_m \in I$, all $h_j \in E_{i_j}$ for $j = 1, \ldots, m$, and all $n \in H_0$.

Proof. The "only if" part is trivial, so it suffices to prove the "if" part. For inductive
reasons, we shall prove the following more general statement: if $l, m_0 \geq 0$, and
$g : H \to G$ is such that $\partial_{h_1} \cdots \partial_{h_m} g$ is a polynomial map up to order $l$ from
$(H, H_I)$ to $(G, G_I^{+i_1 + \cdots + i_m})$ whenever $0 \leq m \leq m_0$, $i_1, \ldots, i_m \in I$ and $h_j \in E_{i_j}$ for
$j = 1, \ldots, m$, then $g$ is a polynomial map from $H$ to $G$ up to order $m_0 + l$. Indeed,
by setting $l = 0$ and sending $m_0 \to \infty$ we obtain the claim.

We establish the claim by induction on $m_0$. The case $m_0 = 0$ is trivial, so suppose
that $m_0 \geq 1$ and that the claim has already been proven for smaller values of $m_0$.

Fix $l$. Let $1 \leq m \leq m_0$ and $i_1, \ldots, i_m \in I$, and suppose that $h_j \in E_{i_j}$ for
$j = 2, \ldots, m$, and write $\tilde g := \partial_{h_2} \cdots \partial_{h_m} g$. By hypothesis, we have that $\partial_{h_1} \tilde g$ is
a polynomial map of order $l$ from $(H, H_I)$ to $(G, G_I^{+i_1 + \cdots + i_m})$ whenever $h_1$ lies in
$E_{i_1}$. Using (B.8) and Corollary B.7 we conclude the same statement holds when
$h_1$ lies in $H_{i_1}$. Also, by induction hypothesis $\tilde g$ is also known to be a polynomial
map of order $l$ from $(H, H_I)$ to $(G, G_I^{+i_2 + \cdots + i_m})$. We conclude that $\tilde g$ is in fact
a polynomial map of order $l + 1$ from $(H, H_I)$ to $(G, G_I^{+i_2 + \cdots + i_m})$. Applying the
induction hypothesis again, we conclude that $g$ is a polynomial map of order $l + m_0$
from $H$ to $G$, as required. □

Example. Let $G_1, G_2, G$ be $I$-filtered groups, and let $B : G_1 \times G_2 \to G$ be a map
which is "bilinear" in the sense that the maps $g_1 \mapsto B(g_1, g_2)$ for fixed $g_2 \in G_2$
and $g_2 \mapsto B(g_1, g_2)$ for fixed $g_1 \in G_1$ are homomorphisms, and such that $B$ maps
$G_{1,i} \times G_{2,j}$ to $G_{i+j}$ for any $i, j \in I$. Then $B$ is a polynomial map, as can be
seen by using Proposition B.8 with $G_{1,i} \times \{\mathrm{id}\} \cup \{\mathrm{id}\} \times G_{2,i}$ as the generating set
for $(G_1 \times G_2)_i = G_{1,i} \times G_{2,i}$. Combining this with Corollary B.5 we conclude
in particular that if $H$ is an $I$-filtered group and $g_1 \in \mathrm{poly}(H_I \to (G_1)_I)$,
$g_2 \in \mathrm{poly}(H_I \to (G_2)_I)$, then $B(g_1, g_2) \in \mathrm{poly}(H_I \to G_I)$; informally, this is asserting
that the product of polynomials is again a polynomial.

Example. Let $G$ be an $\mathbb{N}^k$-filtered group, and let $g \in \mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}^k} \to G_{\mathbb{N}^k})$ be
a polynomial sequence, in which $\mathbb{Z}^k$ is given the multidegree filtration. We can
collapse the $\mathbb{N}^k$-filtration on $G$ to an $\mathbb{N}$-filtration by defining $G_i$ to be the group
generated by $G_{(i_1,\ldots,i_k)}$ for all $(i_1, \ldots, i_k) \in \mathbb{N}^k$ with $i_1 + \cdots + i_k = i$. From
Proposition B.8 we thus conclude that $g$ remains a polynomial map from $\mathbb{Z}^k$ to $G$
if we now give $\mathbb{Z}^k$ the degree filtration, and give $G$ the $\mathbb{N}$-filtration indicated above.

The next lemma describes a useful type of Taylor expansion for polynomial 
sequences. 

Lemma B.9 (Taylor expansion). Let $d \geq 1$ be a natural number, let $G$ be an
$\mathbb{N}^d$-filtered group of degree $\subseteq J$ for some finite downset $J$, and let $g \in \mathrm{poly}(\mathbb{Z}^d_{\mathbb{N}^d} \to G_{\mathbb{N}^d})$,
where $\mathbb{Z}^d$ is given the multidegree filtration. We complete the partial ordering on
$J$ to a total ordering in some arbitrary fashion. Then there exist unique Taylor
coefficients $g_j \in G_j$ for each $j \in J$ such that

$g(n) = \prod_{j \in J} g_j^{\binom{n}{j}}$

for all $n \in \mathbb{Z}^d$. Here we adopt the notational convention

$\binom{(n_1, \ldots, n_d)}{(j_1, \ldots, j_d)} := \binom{n_1}{j_1} \cdots \binom{n_d}{j_d}$.

Proof. We first show uniqueness. Suppose that we have two Taylor expansions that
agree everywhere, that is to say

$\prod_{j \in J} g_j^{\binom{n}{j}} = \prod_{j \in J} (g'_j)^{\binom{n}{j}}$

for all $n \in \mathbb{Z}^d$. Setting $n = 0$ we see that $g_0 = g'_0$. Cancelling this, we see that

$\prod_{j \in J : j > 0} g_j^{\binom{n}{j}} = \prod_{j \in J : j > 0} (g'_j)^{\binom{n}{j}}$.

More generally, suppose inductively that we have shown that $g_j = g'_j$ for all $j \leq j_0$
and

$\prod_{j \in J : j > j_0} g_j^{\binom{n}{j}} = \prod_{j \in J : j > j_0} (g'_j)^{\binom{n}{j}}$

for all $n \in \mathbb{Z}^d$ and some $j_0 \in J$. If $j_0$ is the maximal element of $J$ then we are done.
Otherwise, let $j_1$ be the next element after $j_0$ in the total ordering of $J$. Setting
$n = j_1$ we conclude that $g_{j_1} = g'_{j_1}$, and then we can continue the induction and
establish uniqueness.

Now we show existence by inducting on the cardinality of $J$. The claim is trivial
for $J$ empty, so suppose that $J$ is non-empty, and let $j_*$ be the maximal element of
$J$. The group $G_{j_*}$ is a central subgroup of $G$; if we quotient $G$ by $G_{j_*}$, we obtain
an $\mathbb{N}^d$-filtered group $G/G_{j_*}$ of degree $\subseteq J \setminus \{j_*\}$. Let $\pi : G \to G/G_{j_*}$ be the quotient
map. Applying the induction hypothesis, we have a Taylor expansion

$\pi(g(n)) = \prod_{j \in J \setminus \{j_*\}} h_j^{\binom{n}{j}}$

for some $h_j \in \pi(G_j)$. Writing $h_j = \pi(g_j)$ for some $g_j \in G_j$, and using the central
nature of $G_{j_*}$, we conclude that

$g(n) = \Bigl( \prod_{j \in J \setminus \{j_*\}} g_j^{\binom{n}{j}} \Bigr) g'(n)$

for some $g'(n)$ taking values in $G_{j_*}$. By Corollary B.4, $g'$ is a polynomial sequence,
and therefore

$\partial_{e_1}^{j_1} \cdots \partial_{e_d}^{j_d} g'(n) = \mathrm{id}$

whenever $(j_1, \ldots, j_d) \not\leq j_*$, with $e_1, \ldots, e_d$ being the basis of $\mathbb{Z}^d$. We can "integrate"
this difference equation repeatedly using the abelian nature of $G_{j_*}$ (and the Pascal's
triangle relation $\partial_{e_i} \binom{n}{j + e_i} = \binom{n}{j}$) and conclude that

$g'(n) = \prod_{j \leq j_*} (g'_j)^{\binom{n}{j}}$

for some $g'_j \in G_{j_*}$. Using the central nature of $G_{j_*}$, we conclude that

$g(n) = \prod_{j \in J} (g_j g'_j)^{\binom{n}{j}}$

(with the convention that $g_{j_*} = \mathrm{id}$) and the claim follows. □
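A one-dimensional instance may help to fix ideas. The sketch below assumes $d = 1$, $J = \{0, 1, 2\}$, and that the product in the Taylor expansion is taken in increasing order of $j$; these choices are ours, made for illustration only.

```latex
% Assumed setting: d = 1, J = \{0,1,2\}, product in increasing order of j.
\[
  g(n) = g_0\, g_1^{\binom{n}{1}}\, g_2^{\binom{n}{2}},
  \qquad g_j \in G_j .
\]
% Evaluating at n = 0, 1, 2 recovers the coefficients one at a time,
% mirroring the uniqueness argument:
\begin{align*}
  g(0) &= g_0, \\
  g(1) &= g_0 g_1            &&\Rightarrow\ g_1 = g(0)^{-1} g(1), \\
  g(2) &= g_0 g_1^{2} g_2    &&\Rightarrow\ g_2 = g_1^{-2}\, g(0)^{-1} g(2).
\end{align*}
% Pascal's relation used in the existence step, written for the additive
% difference operator \partial_1 F(n) := F(n+1) - F(n):
\[
  \binom{n+1}{j+1} - \binom{n}{j+1} = \binom{n}{j}.
\]
```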

Corollary B.10 (Pullback). Let $d \geq 1$ be a natural number, let $G$ be an
$\mathbb{N}^d$-filtered group of degree $\subseteq J$ for some finite $J$, and let $g \in \mathrm{poly}(\mathbb{Z}^d_{\mathbb{N}^d} \to G_{\mathbb{N}^d})$.
Let $\phi : G' \to G$ be an $\mathbb{N}^d$-filtered homomorphism of $\mathbb{N}^d$-filtered groups such that
$\phi : G'_j \to G_j$ is surjective for every $j$. Then there exists $g' \in \mathrm{poly}(\mathbb{Z}^d_{\mathbb{N}^d} \to G'_{\mathbb{N}^d})$
such that $g = \phi \circ g'$.



Proof. Apply Lemma lB.9l and then pull back each of the resulting Taylor coefficients 
Appendix C. Lifting linear nilsequences to polynomial ones

The purpose of this appendix is to demonstrate the equivalence of the linear
inverse conjecture, Conjecture 1.2, with the polynomial inverse conjecture, Conjecture
4.5. We remind the reader that this is not strictly speaking necessary to
establish the results in [23], but the latter paper was written before the more general
notion of a polynomial nilsequence came to the fore.

The key observation here is that every polynomial nilsequence of degree $\leq s$ can
be "lifted" to an $s$-step linear nilsequence in a certain sense.

We begin by recording a useful lemma. 

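Lemma C.1 (Discrete polynomials are cocompact). Let $G/\Gamma$ be an $\mathbb{N}$-filtered
nilmanifold. Then $\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}})$ is a lattice (i.e. a discrete cocompact subgroup) of
$\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ (where we give $\mathbb{Z}$ the degree filtration).

Proof. We may assume that $G/\Gamma$ has degree $\leq d$. It will suffice to show that
any polynomial sequence $g \in \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ can be factorised as $g = \gamma g'$ where
$\gamma \in \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}})$ and $g'$ ranges in a compact subset of $\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$. It is
enough to show by induction on $i$ that for every $0 \leq i \leq d + 1$, there exists a
factorisation $g = \gamma_i h_i g'_i$ where $\gamma_i \in \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}})$, $g'_i$ lies in a compact subset of
$\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$, and $h_i \in \mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ is such that $h_i(0) = \cdots = h_i(i - 1) = \mathrm{id}$,
since for $i = d + 1$ this forces $h_i$ to be trivial.

This inductive claim is trivial for $i = 0$ (setting $\gamma_0 = g'_0$ to be trivial). Now
suppose inductively that one has a factorisation $g = \gamma_i h_i g'_i$ for some $0 \leq i \leq d$.
Since $h_i(0) = \cdots = h_i(i - 1) = \mathrm{id}$, we see from Taylor expansion that $h_i(i) \in G_i$.
Since $\Gamma_i := \Gamma \cap G_i$ is cocompact in $G_i$, we may factorise $h_i(i) = \tilde\gamma_{i+1}(i) \tilde g_{i+1}(i)$ for
some $\tilde\gamma_{i+1}(i) \in \Gamma_i$ and $\tilde g_{i+1}(i)$ in a compact subset of $G_i$. By Taylor expansion
we may extend $\tilde\gamma_{i+1}$, $\tilde g_{i+1}$ to elements of $\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}})$ and of a compact subset of
$\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ respectively which are trivial on $0, \ldots, i - 1$. Writing $\gamma_{i+1} := \gamma_i \tilde\gamma_{i+1}$,
$h_{i+1} := \tilde\gamma_{i+1}^{-1} h_i \tilde g_{i+1}^{-1}$, and $g'_{i+1} := \tilde g_{i+1} g'_i$ we obtain the claim. □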

Now we establish the key lifting proposition. 

Proposition C.2 (Polynomial nilsequences can be lifted to linear ones). Let $G/\Gamma$
be a filtered nilmanifold of degree $\leq s$. Then there exists a standard $s$-step nilmanifold
$\tilde G/\tilde\Gamma$, a standard compact subset $K$ of $\tilde G/\tilde\Gamma$, and a standard Lipschitz map
$\pi : K \to G/\Gamma$, such that for every (standard) polynomial sequence $g : \mathbb{Z} \to G$, there
exists $\tilde g \in \tilde G$ and $\tilde x \in \tilde G/\tilde\Gamma$ such that $\tilde g^n \tilde x \in K$ and $g(n)\Gamma = \pi(\tilde g^n \tilde x)$ for all $n \in \mathbb{Z}$.

Indeed, with this proposition, any degree $\leq s$ nilsequence $n \mapsto F(g(n){}^*\Gamma)$ can
then be lifted to an $s$-step linear nilsequence $n \mapsto (F \circ \pi)(\tilde g^n \tilde x)$ with $\tilde g \in {}^*\tilde G$ and
$\tilde x \in {}^*(\tilde G/\tilde\Gamma)$, where $F \circ \pi$ is extended from a Lipschitz function on $K$ to a Lipschitz
function on $\tilde G/\tilde\Gamma$ in some arbitrary fashion. From this one easily concludes that
Conjecture 1.2 follows from Conjecture 4.5. (The converse implication is trivial,
because every linear nilsequence is a polynomial nilsequence.)

To motivate Proposition C.2, let us present an illustrative example. We take $s = 2$
and $G/\Gamma$ to be the unit circle $\mathbb{R}/\mathbb{Z}$ with the quadratic filtration (thus $G_i$ equals $\mathbb{R}$
for $i \leq 2$ and $\{0\}$ for $i > 2$). By Remark 9.6, a polynomial sequence $g : \mathbb{Z} \to G$
then takes the form $g(n) = \alpha_0 + \alpha_1 \binom{n}{1} + \alpha_2 \binom{n}{2}$ for some frequencies $\alpha_0, \alpha_1, \alpha_2$ (i.e.
a non-standard classical quadratic polynomial). To lift this quadratic sequence to
a linear one, we use the Heisenberg nilmanifold $\tilde G/\tilde\Gamma$ (Example 16. lj), and
place inside it the skew torus

K := {g t 1 1 [g 1 ,g 2 } t ^T :t u t [h2] el}. 

This is easily seen to be compact (indeed, it is topologically equivalent to $\mathbb{T}^2$). We
define the map $\pi : K \to \mathbb{T}$ by the formula

*{9i\9\M HlA ) :=*[i,2] mod 1; 
it is easy to see that $\pi$ is well-defined and smooth. If we set

9 ■= 9im[gi,g2f\ x ■=[g 1 ,g 2 ] 1 f 

for some frequencies $\alpha, \beta, \gamma \in \mathbb{R}$, then a brief calculation shows that for any integer
$n$, $\tilde g^n \tilde x$ lies in $K$ and

n(g x) — — -a + np + 7 mod 1, 

and so one can arrange for Tr(g n x) — gin) by choosing a, /3, 7 appropriately in terms 
of «0j ct\,a 2 . 
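The "brief calculation" in the Heisenberg example can be checked mechanically. The sketch below is only an illustration under stated assumptions: it models the Heisenberg group by triples $(t_1, t_2, t_{[1,2]})$ with a specific multiplication law in which $[g_1,g_2] = g_1^{-1}g_2^{-1}g_1g_2$ is the central generator, and it assumes the lift has the shape $\tilde g = g_1^{\alpha} g_2 [g_1,g_2]^{\beta}$, $\tilde x = [g_1,g_2]^{\gamma}\tilde\Gamma$. The function names `mul`, `power`, `lifted_phase` and `expected_phase` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction
from math import comb


def mul(a, b):
    """Product of triples (t1, t2, t12) in an assumed coordinate model of the
    Heisenberg group: g1 = (1,0,0), g2 = (0,1,0), [g1,g2] = (0,0,1)."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2] + a[0] * b[1])


def power(g, n):
    """n-th power of g for an integer n >= 0, by repeated multiplication."""
    out = (0, 0, 0)
    for _ in range(n):
        out = mul(out, g)
    return out


def lifted_phase(alpha, beta, gamma, n):
    """Return pi(g~^n x~) mod 1 for the assumed lift
    g~ = g1^alpha g2 [g1,g2]^beta and x~ = [g1,g2]^gamma Gamma~."""
    g = mul(mul((alpha, 0, 0), (0, 1, 0)), (0, 0, beta))
    x = (0, 0, gamma)
    y = mul(power(g, n), x)
    # Right-multiply by the lattice element g2^{-t2} (t2 = n is an integer),
    # which does not change the coset but brings it into the skew torus K.
    y = mul(y, (0, -y[1], 0))
    assert y[1] == 0
    return y[2] % 1


def expected_phase(alpha, beta, gamma, n):
    """The quadratic polynomial -binom(n,2) alpha + n beta + gamma mod 1."""
    return (-comb(n, 2) * alpha + n * beta + gamma) % 1
```

With exact rational frequencies (via `Fraction`), `lifted_phase` and `expected_phase` agree for every non-negative `n` tried, which is the content of the displayed identity in this coordinate model; a different ordering of the generators in $\tilde g$ would change the quadratic coefficient.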

The above construction was ad hoc in nature, requiring one to conjure up the
Heisenberg group out of thin air. However, it is possible to canonically construct a
lifted nilmanifold $\tilde G/\tilde\Gamma$ in the general case. Fix $G/\Gamma$. By Remark 9.6,
$\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ is a Lie group topologically isomorphic to $\prod_{i \geq 0} G_i$, but with a different group
structure. Since $G$ has degree $< s + 1$, we see that $G$ is $\leq s$-step nilpotent, which
implies that $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ is $\leq s$-step nilpotent also.

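Let $\Gamma_{\mathbb{N}}$ be the restriction of the filtration $G_{\mathbb{N}}$ to $\Gamma$ (Example 6.14); thus $\Gamma$ is now
a filtered group. By Lemma C.1, the quotient of $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ by
$\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}})$ has the structure of an $s$-step
nilmanifold. This is not yet the nilmanifold $\tilde G/\tilde\Gamma$ needed for Proposition C.2, but
we can modify it as follows. We observe that there is a shift automorphism $T$ acting
on both $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ and $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}})$ by the formula $Tg(n) := g(n + 1)$. It
also acts on the Lie algebra $\log \operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ of $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$, which by abuse
of notation we shall call $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \log G_{\mathbb{N}})$. This action is unipotent; indeed, $T - 1$
maps $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \log G_{\mathbb{N}}^{+i})$ to $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \log G_{\mathbb{N}}^{+(i+1)})$ for all $i \geq 0$, where $G^{+i}$ is $G$
with the shifted filtration $G^{+i}_d := G_{d+i}$. The conjugation action of $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$
on $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \log G_{\mathbb{N}})$ has the same unipotence property by the filtered nature
of $G$. Because of this, we see that the conjugation action of the semi-direct product
$\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}}) \rtimes_T \mathbb{Z}$ on $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \log G_{\mathbb{N}})$ is $s$-step unipotent, which implies that
$\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}}) \rtimes_T \mathbb{Z}$ is $s$-step nilpotent.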
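The unipotence of the shift can be seen concretely in the simplest abelian model. The following sketch assumes $G = \mathbb{R}$ with the degree $\leq s$ filtration, so that a polynomial sequence is $g(n) = \sum_i c_i \binom{n}{i}$ and is stored as its list of binomial coefficients; the names `evaluate`, `shift` and `shift_minus_identity` are illustrative only. In this model $T - 1$ simply shifts the coefficient list, so $(T-1)^{s+1} = 0$.

```python
from math import comb


def evaluate(coeffs, n):
    """Evaluate g(n) = sum_i coeffs[i] * binom(n, i) for an integer n >= 0."""
    return sum(c * comb(n, i) for i, c in enumerate(coeffs))


def shift(coeffs):
    """Binomial-basis coefficients of Tg(n) = g(n + 1).

    Uses Pascal's rule binom(n+1, i) = binom(n, i) + binom(n, i-1),
    so the i-th coefficient becomes coeffs[i] + coeffs[i+1]."""
    padded = list(coeffs) + [0]
    return [padded[i] + padded[i + 1] for i in range(len(coeffs))]


def shift_minus_identity(coeffs):
    """Binomial-basis coefficients of (T - 1)g; each application lowers the degree."""
    return [s - c for s, c in zip(shift(coeffs), coeffs)]
```

For a cubic such as `[5, -2, 7, 3]`, three applications of `shift_minus_identity` leave `[3, 0, 0, 0]` and a fourth gives the zero list, mirroring the statement that $T - 1$ moves each filtration level one step up. The non-abelian case in the text needs the Lie algebra and the filtered structure of $G$, which this toy model does not capture.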

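Unfortunately, the group $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}}) \rtimes_T \mathbb{Z}$ is not connected, so it is not
directly suitable for the purposes of establishing Proposition C.2. But this can be
easily remedied by using the unipotent nature of the action of $T$ on
$\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \log G_{\mathbb{N}})$ to express $T = T^1$ for some smooth unipotent group action $t \mapsto T^t$ of the
real line $\mathbb{R}$ on $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \log G_{\mathbb{N}})$, which can then be exponentiated to provide a
unipotent group action (which we will also call $t \mapsto T^t$) on $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$. The
action of the group $\tilde G := \operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}}) \rtimes_T \mathbb{R}$ on $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \log G_{\mathbb{N}})$ is then $s$-step
unipotent, which implies that $\tilde G$ is $s$-step nilpotent.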



Footnote: Note that $\mathbb{Z}$ is viewed as an additive group, while $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ is viewed as a
multiplicative group; we hope that this will not cause confusion.

Footnote: This can also be done by the machinery of Mal'cev bases for both discrete and
continuous nilpotent groups, see [45].

The group $\tilde G$ is an $s$-step nilpotent Lie group which is both connected and simply
connected. It contains the discrete subgroup $\tilde\Gamma := \operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}}) \rtimes_T \mathbb{Z}$. Since
$\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}})$ is cocompact in $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$ (and $\mathbb{Z}$ is cocompact in $\mathbb{R}$), $\tilde\Gamma$ is
cocompact in $\tilde G$; thus $\tilde G/\tilde\Gamma$ has the structure of a nilmanifold.

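There is a canonical map $\theta$ from $\tilde G/\tilde\Gamma$ to $\mathbb{T}$ induced by the projections of $\tilde G$, $\tilde\Gamma$ to
$\mathbb{R}$ and $\mathbb{Z}$ respectively. We denote the kernel $\theta^{-1}(\{0\})$ of this map by $K$; thus $K$ is
a compact subset of $\tilde G/\tilde\Gamma$. Observe that every element of $K$ can be represented as
$(g, 0)\tilde\Gamma$ for some $g \in \operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$, which is unique up to multiplication on the
right by $\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to \Gamma_{\mathbb{N}})$. We then define the map $\pi : K \to G/\Gamma$ by the formula
$\pi((g, 0)\tilde\Gamma) := g(0)\Gamma$; it is clear that $\pi$ is a Lipschitz continuous map.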

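We are now ready to establish Proposition C.2. Let $g \in \operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$; then
we set $\tilde x := (g, 0)\tilde\Gamma \in K$ and $\tilde g := (\mathrm{id}, 1) \in \tilde G$. One easily verifies that for any integer
$n$, $\tilde g^n \tilde x = (T^n g, 0)\tilde\Gamma \in K$, and so $\pi(\tilde g^n \tilde x) = g(n)\Gamma$. Proposition C.2 follows.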

Appendix D. Equidistribution theory 

The purpose of this appendix is to develop the quantitative Ratner-type
equidistribution theory for nilmanifolds, which will help us determine when averages such as

$$\mathbb{E}_{n \in [N]} F(\mathcal{O}(n)) \tag{D.1}$$

are large, for various nilsequences $n \mapsto F(\mathcal{O}(n))$. We will also need a
multidimensional version of this theory, in which $[N]$ is replaced with $[N]^k$, or more generally
by the Cartesian product of $k$ arithmetic progressions.

This theory is based on the results of [24] on equidistribution in nilmanifolds,
translated to the language of ultralimits. The results in this appendix will be needed in
two places. Firstly, Theorem D.6 below, which gives a criterion for when averages
such as (D.1) are large, will be used in §11 to analyse the correlation property
arising from Proposition 7.3. Secondly, Theorem D.5, which (locally) factorises
an arbitrary multidimensional polynomial orbit into equidistributed and smooth
pieces, will be used to give an important criterion for when a nilcharacter is biased
(see Lemma E.11).

We begin with some basic definitions. 

Definition D.1 (Equidistribution). Let $G/\Gamma$ be a standard nilmanifold, which then
admits a canonical Haar probability measure $\mu$. Let $\Omega$ be a non-empty limit finite
set, and let $\mathcal{O} : \Omega \to {}^*(G/\Gamma)$ be a limit function. We say that $\mathcal{O}$ is equidistributed
in $G/\Gamma$ if, for every $F \in \operatorname{Lip}(G/\Gamma)$, one has

$$\mathbb{E}_{n \in \Omega} F(\mathcal{O}(n)) = \int_{G/\Gamma} F\, d\mu + o(1), \tag{D.2}$$

or equivalently if $n \mapsto F(\mathcal{O}(n))$ is unbiased on $\Omega$ whenever $\int_{G/\Gamma} F\, d\mu = 0$.

Now we specialise to the case $\Omega = [N]^k$. We say that $\mathcal{O}$ is totally equidistributed
on $[N]^k$ if it is equidistributed on every product $P_1 \times \dots \times P_k$ of dense arithmetic
progressions $P_1, \dots, P_k$ in $[N]$, thus

$$\mathbb{E}_{n \in P_1 \times \dots \times P_k} F(\mathcal{O}(n)) = \int_{G/\Gamma} F\, d\mu + o(1) \tag{D.3}$$

for every standard Lipschitz function $F : G/\Gamma \to \mathbb{C}$.

Footnote: On the other hand, we will only need to work with the degree filtration, although
it is certain that the theory here would extend to $I$-filtered nilsequences for other orderings.

Remark. We defined equidistribution using standard Lipschitz functions
$F \in \operatorname{Lip}(G/\Gamma)$, but the statement (D.2) for $F \in \operatorname{Lip}(G/\Gamma)$ automatically implies the
same claim for $F \in \operatorname{Lip}({}^*(G/\Gamma))$.

This notion of equidistribution on $[N]$ is closely related to, but not identical with,
the more classical notion of equidistribution involving an infinite sequence $g : \mathbb{Z} \to G$,
in which one takes a limit as $N \to \infty$; we refer to this latter concept as asymptotic
equidistribution in order to distinguish it from the "single-scale" equidistribution
considered here, in which one is working with a fixed (but unbounded) $N$. While
there is a close analogy between the theory of asymptotic equidistribution and
single-scale equidistribution, there does not seem to be a soft way to automatically
transfer results from the former to the latter. Single-scale equidistribution is in fact
much closer to the notion of $\delta$-equidistribution studied for instance in [24]; we refer
readers to that paper for further discussion of the distinction between the different
types of equidistribution.

Example. We consider the case when $G/\Gamma = \mathbb{T}^d$ is a torus. Weyl's
equidistribution criterion, in our notation, then asserts that a limit map $\mathcal{O} : [N]^k \to \mathbb{T}^d$ is
equidistributed if and only if one has

$$\mathbb{E}_{n \in [N]^k} e(\xi \cdot \mathcal{O}(n)) = o(1)$$

for all standard $\xi \in \mathbb{Z}^d \setminus \{0\}$. One can also show (using some Fourier analysis) that
$\mathcal{O}$ will be totally equidistributed if and only if

$$\mathbb{E}_{n \in [N]^k} e(\xi \cdot \mathcal{O}(n)) e(\eta \cdot n) = o(1)$$

for all standard $\xi \in \mathbb{Z}^d \setminus \{0\}$ and $\eta \in \mathbb{Z}^k$. As a consequence of this and some
further Fourier analysis, we see that a one-dimensional linear orbit $\mathcal{O} : [N] \to \mathbb{T}^d$
defined by $\mathcal{O}(n) := \alpha n + \beta$ for some $\alpha, \beta \in \mathbb{T}^d$ will be equidistributed or totally
equidistributed in $\mathbb{T}^d$ if and only if $\alpha$ is not of the form $q + O(N^{-1}) \mod 1$ for some
standard rational $q \in \mathbb{Q}$.
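A finitary analogue of the one-dimensional criterion can be explored numerically. The sketch below is not the non-standard statement itself: it fixes a large but finite `N` in place of an unbounded one, and the helper `weyl_average` is a hypothetical name. It computes $|\mathbb{E}_{n \in [N]} e(\xi \alpha n)|$ for a single integer frequency $\xi$, which should be tiny when $\alpha$ is far from rationals of small denominator and of size comparable to 1 when $\xi\alpha$ is within $O(1/N)$ of an integer.

```python
import cmath
import math


def weyl_average(alpha, xi, N):
    """Return |E_{n in [N]} e(xi * alpha * n)| where e(x) = exp(2*pi*i*x)."""
    total = sum(cmath.exp(2j * math.pi * xi * alpha * n) for n in range(1, N + 1))
    return abs(total) / N
```

For example, with `N = 10000` one can compare `weyl_average(math.sqrt(2), 1, N)` with `weyl_average(1/3 + 0.1/N, 3, N)`: the first decays like $1/N$ by the geometric series bound, while the second stays bounded away from zero because $3\alpha$ differs from an integer by only $0.3/N$.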

Given a standard filtered nilmanifold $G/\Gamma$, a horizontal character is a continuous
standard homomorphism $\xi : G \to \mathbb{T}$ which vanishes on $\Gamma$. We say that the character
is non-trivial if it is not identically zero.

We have the following basic equidistribution criterion, generalising the torus 
example above. 

Theorem D.2 (Leibman theorem). Let $k \in \mathbb{N}^+$, let $N$ be an unbounded natural
number, let $G/\Gamma$ be an $\mathbb{N}$-filtered nilmanifold, and let $\mathcal{O} \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$ be
a $k$-dimensional polynomial orbit, where $\mathbb{Z}^k$ is given the degree filtration. Then on
$[N]^k$, the following statements are equivalent:

(i) $\mathcal{O}$ is totally equidistributed in the nilmanifold $G/\Gamma$;

(ii) $\mathcal{O}$ is equidistributed in the nilmanifold $G/\Gamma$;

(iii) $\mathcal{O}$ is equidistributed in the torus $G/([G,G]\Gamma)$; and

(iv) There does not exist any non-trivial horizontal character $\xi$ such that $\xi \circ \mathcal{O}$
is Lipschitz with constant $O(1/N)$.

Proof. See [24, Theorems 1.19, 2.9, 8.6] (where in fact a more quantitative
strengthening of this equivalence is established). The analogue of this result for asymptotic
equidistribution was established previously in [46] (and the result is classical in the
case of linear sequences). The main difficulty is to show that (iv) implies (ii), which
is the main content of [24, Theorem 2.9], which relies primarily on a certain van
der Corput type equidistribution lemma for nilmanifolds. □

Theorem D.2 implies the following weak factorisation theorem.

Theorem D.3 (Weak factorisation theorem). Let $k \in \mathbb{N}^+$, let $N$ be an unbounded
natural number, let $G/\Gamma$ be an $\mathbb{N}$-filtered nilmanifold and let $g \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to G_{\mathbb{N}})$.
Suppose that $g$ is not totally equidistributed on $[N]^k$ in $G/\Gamma$. Then one can factorise
$g = \varepsilon g' \gamma$, where $\varepsilon, g', \gamma \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to G_{\mathbb{N}})$ have the following properties:

• $\varepsilon$ is a bounded sequence on $[N]^k$ with the $i$-th Taylor coefficient of size
$O(N^{-|i|})$ for each $i \in \mathbb{N}^k$;

• $g'$ takes values in a standard proper rational subgroup $G'$ of $G$ (i.e. $G'$ is a
connected proper Lie subgroup of $G$, and $\Gamma' := G' \cap \Gamma$ is cocompact in $G'$);

• $\gamma$ is periodic modulo $\Gamma$ with a standard period $q \in \mathbb{N}^+$; thus $\gamma(n + qv) =
\gamma(n) \mod \Gamma$ for all $n, v \in {}^*\mathbb{Z}^k$. Furthermore, $\gamma$ takes values in a standard
subgroup $\tilde\Gamma$ of $G$ which contains $\Gamma$ as a subgroup.

Proof. See [24, Proposition 9.2]. The basic idea is to use the non-trivial horizontal
character $\xi$ generated by Theorem D.2 to cut out the subgroup $G'$. In order to keep
$G'$ connected, one needs to first factorise $\xi = m\xi'$ where $m$ is a standard positive
integer and $\xi'$ is an irreducible horizontal character; this integer $m$ is responsible
for the periodic term $\gamma$. □

One can iterate this to obtain a "Ratner-type" theorem. 

Theorem D.4 (Factorisation theorem). Let $k \in \mathbb{N}^+$, let $N$ be an unbounded natural
number, let $G/\Gamma$ be a (filtered) nilmanifold, and let $g \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to G_{\mathbb{N}})$. Then
there exists a standard rational subgroup $G'$ of $G$ (i.e. $G'$ is connected and $G' \cap \Gamma$
is cocompact in $G'$) and a factorisation

$$g(n) = \varepsilon(n) g'(n) \gamma(n)$$

where $\varepsilon, g', \gamma \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to G_{\mathbb{N}})$ have the following additional properties:

• $\varepsilon$ is a bounded sequence with the $i$-th Taylor coefficient of size $O(N^{-|i|})$ for
each $i \in \mathbb{N}^k$, and has Lipschitz constant $O(1/N)$;

• $g'$ takes values in $G'$, and is
totally equidistributed in $G'/\Gamma'$ whenever $\Gamma'$ is any standard subgroup of
$G' \cap \Gamma$ of finite (standard) index;

• $\gamma$ is periodic modulo $\Gamma$ with a standard period, and takes values in a
standard discrete subgroup $\tilde\Gamma$ of $G$ which contains $\Gamma$.

This theorem is a close relative of [24, Theorem 1.19], and can be proven by the
same methods; for the convenience of the reader we sketch a proof here.

Proof. Let us say that $g$ can be represented using a standard rational subgroup
$G'$ of $G$ if one has a factorisation $g = \varepsilon g' \gamma$ which obeys all the conclusions of the
theorem except for the total equidistribution of $g'$. Clearly, $g$ can be represented
using $G$ itself, by setting $\varepsilon$ and $\gamma$ to be the identity and $g' := g$. By the principle
of infinite descent (using the fact that $G$ has a finite standard dimension), we
may thus find a standard rational subgroup $G'$ which represents $g$, and is minimal
in the sense that no proper standard rational subgroup of $G'$ represents $g$. Let
$g = \varepsilon g' \gamma$ be the associated factorisation. It then suffices to show that $g'$ is totally
equidistributed in $G'/\Gamma'$ for every standard finite index subgroup $\Gamma'$ of $\Gamma \cap G'$.

Footnote: The ability to use this principle is an advantage of the ultralimit setting. In the finitary
setting, in which one needs to quantify such concepts as total equidistribution, periodicity, etc.,
one has to instead perform an iterative "dimension reduction argument" which requires one to
manage many more parameters; see [24] for an example of this. See also the beginning of §10 for
a related discussion.

Suppose for contradiction that this is not the case. Applying Theorem D.3, one
can factorise $g' = \varepsilon'' g'' \gamma''$ where $\varepsilon''$ is a bounded sequence with Lipschitz constant
$O(1/N)$, $\gamma''$ is periodic with a standard period, and takes values in a standard
discrete subgroup $\tilde\Gamma'$ that contains $\Gamma'$, and $g''$ takes values in a proper rational
subgroup $G''$ of $G'$. One can enlarge $\tilde\Gamma'$ to contain $\tilde\Gamma$, and this is easily verified to
still be discrete. One can then show that the factorisation $g = (\varepsilon\varepsilon'') g'' (\gamma''\gamma)$ is a
representation of $g$ using $G''$ (see [24, §10] for details), contradicting the minimality
of $G'$. □

It will be useful to convert the factorisation in Theorem D.4 into a more
convenient form, eliminating the periodic factor $\gamma$ and the slowly varying factor $\varepsilon$
by passing to subprogressions.

Theorem D.5 (Factorisation theorem, II). Let $k \in \mathbb{N}^+$, let $N$ be an unbounded
natural number and let $\mathcal{O} \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$. Then one can partition $[N]^k$
into a bounded number of products $P = P_1 \times \dots \times P_k$ of dense arithmetic
subprogressions of $[N]$, such that for each $P$ one has a polynomial $\varepsilon_P \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to G_{\mathbb{N}})$ which
is bounded with Lipschitz constant $O(1/N)$ on $P$ and with the $i$-th Taylor coefficient
of size $O(N^{-|i|})$ for each $i$, a standard rational subgroup $G_P$ of $G$, and a polynomial
sequence $g_P \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to (G_P)_{\mathbb{N}})$ totally equidistributed on $G_P/\Gamma_P$, where
$(G_P)_{\mathbb{N}} := (G_P \cap G_i)_{i \in \mathbb{N}}$ and $\Gamma_P := G_P \cap \Gamma$, such that

$$\mathcal{O}(n) = \varepsilon_P(n) g_P(n)\, {}^*\Gamma$$

for all $n \in P$. Furthermore, for each $i \in \mathbb{N}^k$, the horizontal Taylor coefficients
$\operatorname{Taylor}_i(\mathcal{O})$ and $\operatorname{Taylor}_i(g_P)$ differ by $O(N^{-|i|})$. Finally, for two different products
$P, P'$ of progressions in this partition of $[N]^k$, the sequences $g_P$ and $g_{P'}$ are
conjugate, with $g_{P'} = \gamma_{P,P'}^{-1} g_P \gamma_{P,P'}$ for some $\gamma_{P,P'} \in G$ which is rational in the sense
that $\gamma_{P,P'}^m \in \Gamma$ for some bounded positive integer $m$.

Proof. Write $\mathcal{O}(n) = g(n)\,{}^*\Gamma$ for some $g \in {}^*\operatorname{poly}(\mathbb{Z}^k_{\mathbb{N}} \to G_{\mathbb{N}})$. We apply Theorem
D.4 to obtain a rational standard subgroup $G'$ and a factorisation $g = \varepsilon g' \gamma$ with
the stated properties. The sequence $\gamma$ is periodic with a standard period, so we
may partition $[N]^k$ into a bounded number of products $P = P_1 \times \dots \times P_k$ of dense
arithmetic subprogressions of $[N]$ on which $\gamma = \gamma_P$ is constant. As $\Gamma$ is cocompact,
we may thus find $\gamma'_P \in \gamma_P \Gamma$ which is bounded, thus $\gamma'_P = O(1)$. Note that $\gamma'_P$
lives in a discrete group $\tilde\Gamma$ and is thus standard. Since $\Gamma$ is cocompact, it has finite
index in $\tilde\Gamma$, which implies that $\gamma'_P$ is rational, or equivalently that $\gamma'_P$ has rational
coefficients with respect to a Mal'cev basis [48] of $G/\Gamma$.

For $n \in P$, we can write

$$\mathcal{O}(n) = \varepsilon(n) g'(n) \gamma'_P\, {}^*\Gamma = \varepsilon(n) \gamma'_P g_P(n)\, {}^*\Gamma$$

where $g_P(n) := (\gamma'_P)^{-1} g'(n) \gamma'_P$ is the conjugate of $g'(n)$ by $\gamma'_P$. Note that this gives
the claim about the conjugate nature of $g_P$ and $g_{P'}$.

As $\gamma'_P$ is rational, the conjugate $\gamma'_P \Gamma (\gamma'_P)^{-1}$ intersects $\Gamma$ in a subgroup $\Gamma'$ of finite
index, which then has the property that $\Gamma' \gamma'_P \subset \gamma'_P \Gamma$. From this, we see that the
conjugation operation $g \mapsto (\gamma'_P)^{-1} g \gamma'_P$ on $G$ descends to a continuous projection of
$G/\Gamma'$ to $G/\Gamma$, which maps $g'(n)\,{}^*\Gamma'$ to $g_P(n)\,{}^*\Gamma$. Since $g'(n)$ is totally equidistributed
on $G'/(G' \cap \Gamma')$ by construction, we conclude that $g_P$ is totally equidistributed on
$G_P/(G_P \cap \Gamma)$, where $G_P := (\gamma'_P)^{-1} G' \gamma'_P$ is the conjugate of $G'$. Note that $G_P$ is
also a standard rational subgroup of $G$. If we now set $\varepsilon_P := \varepsilon \gamma'_P$, we obtain all the
claims except for the one about horizontal Taylor coefficients. But from the remarks
following Definition 9.6 and the factorisations $g = \varepsilon g' \gamma$, $g_P = (\gamma'_P)^{-1} g' \gamma'_P$ we have

$$\operatorname{Taylor}_i(g) = \operatorname{Taylor}_i(\varepsilon) \operatorname{Taylor}_i(g') \operatorname{Taylor}_i(\gamma)$$

and

$$\operatorname{Taylor}_i(g_P) = \operatorname{Taylor}_i(g').$$

Since $\gamma$ takes values in $\tilde\Gamma$, $\operatorname{Taylor}_i(\gamma)$ vanishes. Finally, by construction we have
$\operatorname{Taylor}_i(\varepsilon) = O(N^{-|i|})$. The claim follows. □

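We can now give a criterion for when an average of the form $\mathbb{E}_{n \in [N]} F(\mathcal{O}(n))$ is
large.

Theorem D.6 (Ratner-type theorem). Let $G/\Gamma$ be an $\mathbb{N}$-filtered nilmanifold of some
degree $d$, let $\mathcal{O} \in {}^*\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$ be a polynomial orbit, and let

$$F \in \operatorname{Lip}({}^*(G/\Gamma) \to \overline{\mathbb{C}}^{\omega})$$

be such that

$$|\mathbb{E}_{n \in [N]} F(\mathcal{O}(n))| \gg 1.$$

Then one has

$$\left| \int_{G_P/\Gamma_P} F(\varepsilon x)\, d\mu(x) \right| \gg 1$$

for some bounded $\varepsilon \in G$ and some rational subgroup $G_P$ of $G$, with the property
that

$$\pi_{\operatorname{Horiz}_i(G)}(G_P \cap G_i) \geq \Xi_i^{\perp}$$

where the horizontal space $\operatorname{Horiz}_i(G)$ and the projection map $\pi_{\operatorname{Horiz}_i(G)} : G_i \to
\operatorname{Horiz}_i(G)$ were defined in Definition 9.6,

$$\Xi_i^{\perp} := \{ x \in \operatorname{Horiz}_i(G) : \xi_i(x) = 0 \text{ for all } \xi_i \in \Xi_i \}$$

and $\Xi_i \leq \widehat{\operatorname{Horiz}_i(G/\Gamma)}$ is the group of all (standard) continuous homomorphisms
$\xi_i : \operatorname{Horiz}_i(G/\Gamma) \to \mathbb{T}$ such that

$$\xi_i(\operatorname{Taylor}_i(\mathcal{O})) = O(N^{-i}).$$

One could also generalise this theorem to multidimensional orbits, but we will
not need to do so in this paper. We will motivate this theorem with some examples
after the proof.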

Proof. By taking components we may assume that $F$ is scalar-valued. Write $\mathcal{O}(n) =
g(n)\,{}^*\Gamma$ for some $g \in {}^*\operatorname{poly}(\mathbb{Z}_{\mathbb{N}} \to G_{\mathbb{N}})$. We partition $[N]$ into dense arithmetic
progressions $P$ induced from the partition of $[N]$ coming from Theorem D.5 (using
the Chinese remainder theorem and passing to dense subprogressions as necessary).
By the pigeonhole principle, for at least one of these progressions $P$ one has

$$|\mathbb{E}_{n \in P} F(g(n)\,{}^*\Gamma)| \gg 1.$$

Now let $\delta > 0$ be a small standard number to be chosen later. By further
partitioning of $P$ and the pigeonhole principle one can assume that $P$ has diameter at
most $\delta N$ (note that the implied constant in the $\gg$ notation remains independent
of $\delta$ when doing so). Then for any $n, n_0 \in P$, $\varepsilon_P(n)$ and $\varepsilon_P(n_0)$ differ by $O(\delta)$, and so
(by the Lipschitz nature of $F$) $F(g(n)\,{}^*\Gamma)$ differs from $F(\varepsilon_P(n_0) g_P(n)\,{}^*\Gamma)$ by $O(\delta)$.
Thus, for $\delta$ sufficiently small, and setting $\varepsilon := \varepsilon_P(n_0)$, one has

$$|\mathbb{E}_{n \in P} F(\varepsilon g_P(n)\,{}^*\Gamma)| \gg 1.$$

Using the total equidistribution of $g_P$, we have

$$\mathbb{E}_{n \in P} F(\varepsilon g_P(n)\,{}^*\Gamma) = \int_{G_P/\Gamma_P} F(\varepsilon x)\, d\mu(x) + o(1)$$

and so

$$\left| \int_{G_P/\Gamma_P} F(\varepsilon x)\, d\mu(x) \right| \gg 1.$$

To finish the proof of Theorem D.6 we need to show that

$$\pi_{\operatorname{Horiz}_i(G)}(G_P \cap G_i) \geq \Xi_i^{\perp} \tag{D.4}$$

for all positive standard integers $i$, with $\Xi_i$ as in Theorem D.6.

Fix $i$. To show the above claim, observe that $g_P$ takes values in $G_P$, and so
$\operatorname{Taylor}_i(g_P) \in \pi_{\operatorname{Horiz}_i(G/\Gamma)}(G_P \cap G_i)$. On the other hand, $\operatorname{Taylor}_i(g_P)$ differs from
$\operatorname{Taylor}_i(g)$ by $O(N^{-i})$, and so

$$\operatorname{dist}(\operatorname{Taylor}_i(g), \pi_{\operatorname{Horiz}_i(G/\Gamma)}(G_P \cap G_i)) = O(N^{-i}). \tag{D.5}$$

Suppose the inclusion (D.4) failed. Then by duality (and the rational nature of $G_P$),
there exists a standard continuous homomorphism $\xi_i : \operatorname{Horiz}_i(G/\Gamma) \to \mathbb{T}$ outside of
$\Xi_i$ which annihilates $\pi_{\operatorname{Horiz}_i(G/\Gamma)}(G_P \cap G_i)$. From (D.5), this implies that

$$\xi_i(\operatorname{Taylor}_i(g)) = O(N^{-i}),$$

and thus $\xi_i \in \Xi_i$ by definition of $\Xi_i$, a contradiction. The claim follows. □

To get a feel for this theorem, let us first examine a simple special case, when
$G/\Gamma$ is just a two-dimensional torus $\mathbb{T}^2$, and $\mathcal{O}$ is a linear orbit $\mathcal{O}(n) := (\alpha n, \beta n)$
for some $\alpha, \beta \in {}^*\mathbb{T}$. We take $F$ to be a standard Lipschitz function from $\mathbb{T}^2$ to $\mathbb{C}$.
Our hypothesis is then the assertion that

$$|\mathbb{E}_{n \in [N]} F(\alpha n, \beta n)| \gg 1.$$

The conclusion is then that

$$\left| \int_T F(\varepsilon + x)\, d\mu_T(x) \right| \gg 1$$

for some subtorus $T := G_P/(G_P \cap \mathbb{Z}^2)$ of $\mathbb{T}^2$, where $\varepsilon \in \mathbb{T}^2$ and $G_P$ is a rational
subgroup of $\mathbb{R}^2$. Furthermore, $G_P$ contains the subgroup

$$\Xi_1^{\perp} := \{ x \in \mathbb{R}^2 : \xi(x) = 0 \text{ for all } \xi \in \Xi_1 \}$$

and $\Xi_1$ is the subgroup of $\mathbb{Z}^2$ defined by

$$\Xi_1 := \{ \xi \in \mathbb{Z}^2 : \xi \cdot (\alpha, \beta) = O(N^{-1}) \}.$$

We investigate some subcases of this result. First consider the case when $\alpha, \beta$ are
both within $O(N^{-1})$ of standard rationals. Then $\Xi_1$ is a finite index subgroup
of $\mathbb{Z}^2$, and so $\Xi_1^{\perp}$ is trivial. The conclusion is then simply the trivial conclusion
that $|F(\varepsilon)| \gg 1$ for some $\varepsilon \in \mathbb{T}^2$, which was of course obvious from the pigeonhole
principle.

Now suppose that $\beta$ is within $O(N^{-1})$ of a standard rational $p/q$ with $p, q$
coprime, but that $\alpha$ does not lie within $O(N^{-1})$ of a standard rational. Then
$\Xi_1 = \{(0, qa) : a \in \mathbb{Z}\}$, and so $\Xi_1^{\perp} = \mathbb{R} \times \{0\}$. The conclusion is now that
$|\int_{\mathbb{T}} F(x, \varepsilon)\, dx| \gg 1$ for some $\varepsilon \in \mathbb{T}$. This can also be seen directly by observing
that on any subprogression of $[N]$ of spacing $q$ and length $\delta N$ for some small $\delta > 0$,
the orbit $\mathcal{O}(n)$ is within $O(\delta)$ of being equidistributed on a coset $\mathbb{T} \times \{\varepsilon\}$ of $\mathbb{T} \times \{0\}$
for some $\varepsilon \in \mathbb{T}$, with the implied constant in the $O(\delta)$ notation independent of $\delta$.
The claim then follows from the pigeonhole principle (choosing $\delta$ sufficiently small,
but still standard) and the Lipschitz nature of $F$.
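The second subcase can be illustrated with a small numerical experiment. This is only a finitary sketch under simplifying assumptions: $\beta$ is taken to be exactly $p/q = 1/3$ (so no short subprogressions of length $\delta N$ are needed), $\alpha = \sqrt{2}$ stands in for a frequency far from rationals, and the test function `F` and the helper `progression_average` are hypothetical choices made for the example. Along each residue class $n \equiv r \pmod q$ the orbit should then equidistribute on the coset $\mathbb{T} \times \{r/q\}$.

```python
import math


def F(x, y):
    """A smooth test function on the 2-torus with
    integral over x equal to 2 + cos(2*pi*y)."""
    return (1 + math.cos(2 * math.pi * x)) * (2 + math.cos(2 * math.pi * y))


def progression_average(func, alpha, beta, r, q, N):
    """Average of func(alpha*n mod 1, beta*n mod 1) over n in [N], n = r mod q.

    Assumes 1 <= r <= q so that the progression starts inside [N]."""
    ns = range(r, N + 1, q)
    return sum(func((alpha * n) % 1.0, (beta * n) % 1.0) for n in ns) / len(ns)
```

For instance, `progression_average(F, math.sqrt(2), 1/3, r, 3, 30000)` is close to `2 + math.cos(2 * math.pi * r / 3)` for each `r` in `1, 2, 3`, which is the integral of `F` over the coset $\mathbb{T} \times \{r/3\}$.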

Finally, suppose that $\alpha, \beta$ are incommensurate in the sense that there does not
exist any non-zero $\xi \in \mathbb{Z}^2$ for which $\xi \cdot (\alpha, \beta) = O(N^{-1})$. Then $\Xi_1$ is trivial and so
$\Xi_1^{\perp} = \mathbb{R}^2$. The claim is then that $|\int_{\mathbb{T}^2} F(x,y)\, dx\, dy| \gg 1$, which is also apparent
from the equidistribution of $\mathcal{O}$ in $\mathbb{T}^2$ in this case.

One can also repeat the above example with the linear orbit $n \mapsto (\alpha n, \beta n)$
replaced by a polynomial orbit such as $n \mapsto (\alpha n^D, \beta n^D)$ for some standard $D \geq 1$.
The discussion is identical, except that the $O(N^{-1})$ errors must now be replaced
by $O(N^{-D})$.

Now we consider the more general non-abelian setting, in which $G/\Gamma$ is not
necessarily a torus (i.e. we allow $d$ to exceed 1). We first remark upon the
"incommensurate", "generic", or "equidistributed" case when all the $\Xi_i$ are trivial, i.e.
there are no non-trivial relations of the form

$$\xi_i(\operatorname{Taylor}_i(\mathcal{O})) = O(N^{-i}).$$

In this case, $\Xi_i^{\perp} = \operatorname{Horiz}_i(G)$ and so the maps $\pi_i : G_P \cap G_i \to \operatorname{Horiz}_i(G)$ are all
surjective. This implies that all the horizontal spaces of the quotient group $G/G_P$
are trivial, which one easily sees to imply that $G/G_P$ itself must be trivial, i.e. that
$G_P = G$. We conclude that $|\int_{G/\Gamma} F\, d\mu| \gg 1$. Indeed, in this case it turns out that
$\mathcal{O}$ is totally equidistributed and

$$\mathbb{E}_{n \in [N]} F(\mathcal{O}(n)) = \int_{G/\Gamma} F\, d\mu + o(1).$$

This fact can also be deduced from the arithmetic counting lemma [27, Theorem
1.11].

Finally, to illustrate how we actually use Theorem D.6 in practice, we consider
a model problem in which we are given frequencies $\alpha, \beta, \alpha', \beta' \in {}^*\mathbb{T}$ obeying the
correlation property

$$|\mathbb{E}_{n \in [N]} e(\{\alpha n\}\beta n) e(\{\alpha' n\}\beta' n)| \gg 1, \tag{D.6}$$

and we wish to conclude some constraints between these four frequencies;
informally, the problem here is to determine for which frequencies $\alpha, \beta, \alpha', \beta'$ one can
have a non-trivial relationship between $\{\alpha n\}\beta n$ and $\{\alpha' n\}\beta' n$ (cf. (6.4)). Strictly
speaking, for the analysis that we are about to give to apply, we must first replace
the bracket polynomial expressions above by suitable vector-valued smoothings (or
else develop analogues of the above equidistribution theory for piecewise Lipschitz
nilsequences, as was done in the $d = 2$ case in [28]), but to simplify the exposition
we shall completely ignore this technical issue here.

Ignoring the technical issue alluded to above, we can express the left-hand side of
(D.6) in the form $|\mathbb{E}_{n \in [N]} F(\mathcal{O}(n))|$, where $G/\Gamma$ is the product Heisenberg
nilmanifold of degree $\leq 2$, generated by four generators $e_1, e_2, e'_1, e'_2$ with $[e_1,e_2]$, $[e'_1,e'_2]$
central (and with $e_1, e_2$ commuting with $e'_1, e'_2$),

$$\mathcal{O}(n) := e_2^{\beta n} e_1^{\alpha n} (e'_2)^{\beta' n} (e'_1)^{\alpha' n}\, {}^*\Gamma,$$

and $F$ is a (piecewise) Lipschitz function on $G/\Gamma$ obeying the vertical frequency
property

$$F([e_1,e_2]^{t_{12}} [e'_1,e'_2]^{t'_{12}} x) = e(-t_{12} - t'_{12}) F(x) \tag{D.7}$$

for all $x \in G/\Gamma$ and $t_{12}, t'_{12} \in \mathbb{R}$. Note that $\operatorname{Horiz}_1(G)$ is isomorphic to $\mathbb{R}^4$
(being generated by the projections of $e_1, e_2, e'_1, e'_2$ via $\pi_{\operatorname{Horiz}_1(G)}$), while $\operatorname{Horiz}_2(G)$ is
trivial. Applying Theorem D.6 we conclude that

$$\left| \int_{G_P/\Gamma_P} F(\varepsilon x)\, d\mu(x) \right| \gg 1$$

for some bounded $\varepsilon \in G$ and some rational subgroup $G_P$ of $G$, with the property
that

$$\pi_{\operatorname{Horiz}_1(G)}(G_P) \geq \Xi_1^{\perp} \tag{D.8}$$

where

$$\Xi_1 := \{ \xi \in \mathbb{Z}^4 : \xi \cdot (\alpha, \beta, \alpha', \beta') = O(N^{-1}) \}.$$

If the vertical group $G_P \cap G_2$ contains any element $[e_1,e_2]^{t_{12}} [e'_1,e'_2]^{t'_{12}}$ with
$-t_{12} - t'_{12} \neq 0$, then from (D.7) we see that $\int_{G_P/\Gamma_P} F(\varepsilon x)\, d\mu(x) = 0$, a contradiction. We
conclude that

$$G_P \cap G_2 \subset \{ ([e_1,e_2][e'_1,e'_2]^{-1})^t : t \in \mathbb{R} \}. \tag{D.9}$$

This gives us some information concerning the group $\Xi_1$, and hence on the
frequencies $\alpha, \beta, \alpha', \beta'$. Indeed, suppose that we are given two elements $(a,b,a',b')$ and
$(c,d,c',d')$ in $\Xi_1^{\perp}$. By (D.8), we conclude that $G_P$ contains two elements $g, h$ such
that

$$g = e_1^a e_2^b (e'_1)^{a'} (e'_2)^{b'} \mod G_2$$

and

$$h = e_1^c e_2^d (e'_1)^{c'} (e'_2)^{d'} \mod G_2.$$

Since $g$ and $h$ lie in $G_P$, the commutator

$$[g,h] = [e_1,e_2]^{ad - bc} [e'_1,e'_2]^{a'd' - b'c'}$$

must also lie in $G_P$. Comparing this with (D.9) we obtain an algebraic constraint
on $\Xi_1$ that prevents it from being too small, namely that

$$(ad - bc) + (a'd' - b'c') = 0 \tag{D.10}$$

whenever $(a,b,a',b'), (c,d,c',d') \in \mathbb{Z}^4$ are both orthogonal to $\Xi_1$; thus the symplectic
form (D.10) must vanish when restricted to $\Xi_1^{\perp}$.

For instance, suppose that $(\alpha', \beta') = (\beta, \alpha)$, but that $\alpha, \beta$ are otherwise in general
position (cf. (6.4)). Then $\Xi_1$ is generated by $(1,0,0,1)$ and $(0,1,1,0)$, so $\Xi_1^{\perp}$ is
generated by $(1,0,0,-1)$ and $(0,1,-1,0)$, and one easily verifies the property. It is
in principle possible to work out what other quadruples $\alpha, \beta, \alpha', \beta'$ are permitted
by Theorem D.6, but we will not compute this here.
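The verification mentioned in the last example is a finite computation, and can be sketched as follows. The code treats the constraint (D.10) as the bilinear form $(ad - bc) + (a'd' - b'c')$ on quadruples $(a,b,a',b')$ and $(c,d,c',d')$ and checks that it vanishes on small integer combinations of a given list of generators; the names `symplectic`, `combos` and `vanishes_on_span` are hypothetical helpers, and checking a bounded box of coefficients is enough here only because the form is bilinear.

```python
from itertools import product


def symplectic(u, v):
    """The form (ad - bc) + (a'd' - b'c') for u = (a, b, a', b'), v = (c, d, c', d')."""
    a, b, a2, b2 = u
    c, d, c2, d2 = v
    return (a * d - b * c) + (a2 * d2 - b2 * c2)


def combos(gens, coeff_range):
    """Yield all integer combinations of the generators with coefficients in coeff_range."""
    for coeffs in product(coeff_range, repeat=len(gens)):
        yield tuple(sum(c * g[k] for c, g in zip(coeffs, gens)) for k in range(4))


def vanishes_on_span(gens, coeff_range=range(-2, 3)):
    """Check that the form vanishes on every pair of sampled combinations."""
    vecs = list(combos(gens, coeff_range))
    return all(symplectic(u, v) == 0 for u in vecs for v in vecs)
```

Calling `vanishes_on_span([(1, 0, 0, -1), (0, 1, -1, 0)])` confirms the property for the generators quoted above, whereas a pair such as `[(1, 0, 0, 0), (0, 1, 0, 0)]` fails the test, since the form takes the value 1 on it.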

Appendix E. Some basic properties of nilcharacters and symbols 

In this appendix we establish some basic properties of nilcharacters and symbols; 
this material is broadly comparable to [2E1 §3] . 

Throughout this appendix, $I$ is understood to be an ordering (see Definition 6.7).
We first begin with some basic closure properties of nilsequences.

Lemma E.1 (Nilsequences are preserved by Lipschitz operations). Let $H$ be an
$I$-filtered group, let $J$ be a finite downset in $I$, and let $\Omega$ be a limit subset of ${}^*H$. If
$\psi_i \in \operatorname{Nil}^{\subset J}(\Omega \to \mathbb{C}^{D_i})$ and $D_i \in \mathbb{N}^+$ for $i = 1, \dots, m$, and $F : \mathbb{C}^{D_1} \times \dots \times \mathbb{C}^{D_m} \to
\mathbb{C}^D$ is a locally Lipschitz standard function, then $F(\psi_1, \dots, \psi_m) \in \operatorname{Nil}^{\subset J}(\Omega \to \mathbb{C}^D)$.

Proof. This follows immediately from Definition 6.19 and Example 6.13. □

As an immediate corollary we have the following. 

Corollary E.2 (Algebra property). Let $H$ be an $I$-filtered group, let $J$ be a
finite downset of $I$, and let $\Omega$ be a limit subset of ${}^*H$. Then $\operatorname{Nil}^{\subset J}(\Omega \to \mathbb{C})$ is a
sub-*-algebra of $L^{\infty}(\Omega \to \mathbb{C})$, that is to say it is closed under pointwise
multiplication, scalar multiplication by bounded constants, addition, and complex conjugation.
Similarly, $\operatorname{Nil}^{\subset J}(\Omega \to \overline{\mathbb{C}}^{\omega})$ is closed under complex conjugation, tensor product, and
bounded linear combinations.

Remark. From the example after Corollary B.5 we also see that if $\psi \in L^{\infty}({}^*H \to
\overline{\mathbb{C}}^{\omega})$ is a nilsequence of degree $\subset J$, then so is any translate $\psi(\cdot + h)$ or dilate $\psi(q\,\cdot)$
of $\psi$ for $h \in {}^*H$ and $q \in \mathbb{Z}$.

Lemma E.3 (Basic facts about nilcharacters). Let $H = (H, +)$ be an $I$-filtered
abelian group for some $I$, let $d \in I$, and let $\chi, \chi'$ be nilcharacters in $\Xi^d({}^*H)$. Then
$\chi \otimes \chi'$, $\chi(\cdot + h)$, $\chi(q\,\cdot)$, and $\overline{\chi}$ are also nilcharacters of degree $\leq d$ for every $h \in {}^*H$
and $q \in \mathbb{Z}$.

More generally, if $T : H' \to H$ is a (standard) filtered homomorphism from
another $I$-filtered abelian group $H' = (H', +)$ to $H$, then $\chi \circ T$ is a nilcharacter in
$\Xi^d({}^*H')$.

Finally, one has $\Xi^{d'}({}^*H) \subset \Xi^{d}({}^*H)$ whenever $d' < d$.

Proof. This follows from Corollary B.5 (cf. the example after that corollary, and
Corollary E.2). □

From (6.5) it is trivial that a multidimensional polynomial of multidegree $\subset J \cup J'$
can be decomposed as the sum of a multidimensional polynomial of multidegree
$\subset J$, and a multidimensional polynomial of multidegree $\subset J'$. There is an analogous
decomposition for nilcharacters.

Lemma E.4 (Splitting lemma). Let $k \in \mathbb{N}^+$, and let $J, J'$ be finite downsets of $\mathbb{N}^k$.
Let $\psi \in \operatorname{Nil}^{\subset J \cup J'}({}^*\mathbb{Z}^k \to \mathbb{C})$ be a nilsequence, and let $\varepsilon > 0$ be standard. Then

$$\left\| \psi(n) - \sum_{k=1}^{K} \psi_k(n) \psi'_k(n) \right\|_{L^{\infty}({}^*\mathbb{Z}^k)} < \varepsilon$$

where $K$ is standard and for each $1 \leq k \leq K$, $\psi_k \in \operatorname{Nil}^{\subset J}({}^*\mathbb{Z}^k \to \mathbb{C})$ and $\psi'_k \in
\operatorname{Nil}^{\subset J'}({}^*\mathbb{Z}^k \to \mathbb{C})$.

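Proof. We can write $\psi = F \circ \mathcal{O}$, where $F \in \operatorname{Lip}({}^*(G/\Gamma) \to \mathbb{C})$, $\mathcal{O} \in {}^*\operatorname{poly}(\mathbb{Z}^k \to
G/\Gamma)$, and $G/\Gamma$ is an $\mathbb{N}^k$-filtered nilmanifold with degree $\subset J \cup J'$.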

For each $j \in J \cup J'$, let $e_{j,1}, \dots, e_{j,d_j}$ be a basis of generators for $\Gamma_j$. We
may then lift $G$ to the universal nilpotent Lie group that is formally generated by
the $e_{j,i}$, subject to the constraint that any $(r-1)$-fold iterated commutator of the
$e_{j_1,i_1}, \dots, e_{j_r,i_r}$ with $j_1 + \dots + j_r \notin J \cup J'$ vanishes, and similarly lift $\Gamma, F, \mathcal{O}$ (using
Corollary B.10 for the latter). Thus we may assume without loss of generality that
$G$ is universal.

The degree $\subset J \cup J'$ nilmanifold $G/\Gamma$ projects down to the degree $\subset J$ nilmanifold
$G/G_{>J}\Gamma$, where $G_{>J}$ is the group generated by the $G_j$ for all $j \in J' \setminus J$. Similarly we
have a projection from $G/\Gamma$ to the degree $\subset J'$ nilmanifold $G/G_{>J'}\Gamma$. The algebras
$\operatorname{Lip}({}^*(G/G_{>J}\Gamma) \to \mathbb{C})$, $\operatorname{Lip}({}^*(G/G_{>J'}\Gamma) \to \mathbb{C})$ then pull back to subalgebras of
$\operatorname{Lip}({}^*(G/\Gamma) \to \mathbb{C})$. By universality of $G$, $G_{>J}$ and $G_{>J'}$ are disjoint. Thus, the
union of these two algebras separates points in $G/\Gamma$. By the Stone-Weierstrass
theorem, one can thus approximate $F$ to arbitrary accuracy by sums of products of elements
from these algebras, and the claim follows. □

Next, we show that nilsequences can be decomposed into nilcharacters. 

Lemma E.5 (Fourier decomposition). Let $H$ be an $I$-filtered group, and let $d \in I$.
If $\psi \in \operatorname{Nil}^{\leq d}({}^*H)$ and $\varepsilon > 0$ is standard, then one can find a standard natural
number $m$, and nilcharacters $\chi_j \in \Xi^d({}^*H)$, scalar nilsequences $\psi_j \in \operatorname{Nil}^{<d}({}^*H)$,
and bounded linear transformations $T_j : \mathbb{C}^{D_j} \to \mathbb{C}^{D}$ for suitable dimensions $D_j, D$
for each $1 \leq j \leq m$ such that

$$\left\| \psi - \sum_{j=1}^{m} T_j(\psi_j \otimes \chi_j) \right\|_{L^{\infty}({}^*H)} \leq \varepsilon.$$

Proof. It suffices to show this for scalar nilsequences $\psi$. Let $G/\Gamma$ be an $I$-filtered
nilmanifold of degree $\leq d$, let $F \in \operatorname{Lip}({}^*(G/\Gamma) \to \mathbb{C})$, and let $\varepsilon > 0$. We need to
show that one can approximate $F$ to uniform error at most $\varepsilon$ by $\sum_{j=1}^{m} T_j(F_j \otimes f_j)$,
where each $F_j \in \operatorname{Lip}(G/\Gamma \to S^{2D_j - 1})$ has a vertical frequency, $f_j \in \operatorname{Lip}(G/\Gamma \to \mathbb{C})$
is invariant with respect to the $G_d$ action (so that $f_j$ descends to the quotient
nilmanifold $G/G_d\Gamma$, which has degree $< d$), and the $T_j : \mathbb{C}^{D_j} \to \mathbb{C}$ are linear
transformations.

Observe that the class of functions of the form $\sum_{j=1}^{m} T_j(F_j \otimes f_j)$ forms a complex
algebra that is closed under conjugation. Thus by the Stone-Weierstrass theorem,
it suffices to show that functions of the form $F \otimes f$, where $F \in \operatorname{Lip}(G/\Gamma \to S^{2D-1})$ has
a vertical frequency and $f \in \operatorname{Lip}(G/\Gamma \to \mathbb{C})$ is invariant under $G_d$, separate
points. This is trivial for two points which descend to distinct points on $G/G_d\Gamma$,
so it suffices to do so for two points on a common $G_d$ fibre. For this, it suffices by
the definition of vertical frequency to show that for each $g \in G_d$ with $g \notin \Gamma_d$, there
exists a function $F \in \operatorname{Lip}(G/\Gamma \to S^{2D-1})$ which has a vertical frequency $\eta$ with $\eta(g) \notin \mathbb{Z}$.

The existence of a character $\eta : G_d \to \mathbb{R}$ with $\eta(g) \notin \mathbb{Z}$ is guaranteed by
Pontryagin duality. Fixing such an $\eta$, we now perform the same construction used at
the start of §6 (i.e. smoothly partition the base space $G/G_d\Gamma$ into balls of small
radius) to generate the desired function $F$. □

Corollary E.6 (Correlation). Let $H$ be an $I$-filtered group, let $d \in I$, and let $\Omega$ be
a limit finite subset of ${}^*H$. If $f \in L^{\infty}(\Omega)$ is $\leq d$-biased, then $f$ correlates with a
nilcharacter in $\Xi^d(\Omega)$.

Proof. We assume inductively that the claim has already been proven for all smaller
values of $d$. We may assume that $f$ is scalar. Applying Lemma E.5 for $\varepsilon$ small
enough, we see that $f$ correlates with an expression of the form $\sum_{j=1}^{m} T_j(\psi_j \otimes \chi_j)$,
and thus by the pigeonhole principle, $f$ correlates with one of the $\psi_j \otimes \chi_j$, and
thus $f \overline{\chi_j}$ correlates with $\psi_j$. We can express the downset $\{i \in I : i < d\}$ as the
finite union of downsets $\{i \in I : i \leq d'\}$ for various $d' < d$. Applying Lemma E.4
repeatedly for sufficiently small $\varepsilon$, we thus see that $f \overline{\chi_j}$ correlates with $\prod_{d' < d} \psi_{d'}$,
where each $\psi_{d'}$ is a nilsequence of degree $\leq d'$. Applying the inductive hypothesis
repeatedly, we thus see that $f \overline{\chi_j}$ correlates with $\bigotimes_{d' < d} \chi_{d'}$ for some nilcharacters
$\chi_{d'}$ of degree $\leq d'$, and so $f$ correlates with $\chi_j \otimes \bigotimes_{d' < d} \chi_{d'}$. The claim now follows
from Lemma E.3. □

We turn now to a discussion of the basic properties of symbols. We begin by clearing up a small issue left over from §6, that of proving that the notion of equivalence we introduced in Definition 6.22 is indeed an equivalence relation. Recall that nilcharacters $\chi$ and $\chi'$ were said to be equivalent if $\chi \otimes \overline{\chi'}$ is a nilsequence of degree strictly less than $d$.

Lemma E.7. Equivalence of nilcharacters, thus defined, is an equivalence relation. 

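Proof. The symmetry is obvious. For transitivity, suppose that $\chi_1 \sim \chi_2$ and that $\chi_2 \sim \chi_3$. Then each component of

$$(\chi_1 \otimes \overline{\chi_2}) \otimes (\chi_2 \otimes \overline{\chi_3}) = \chi_1 \otimes (\overline{\chi_2} \otimes \chi_2) \otimes \overline{\chi_3}$$

is a nilsequence of degree strictly less than $d$. However the trace of $\overline{\chi_2} \otimes \chi_2$ is 1, and so $\chi_1 \otimes \overline{\chi_3}$ is a combination of the components of $\chi_1 \otimes (\overline{\chi_2} \otimes \chi_2) \otimes \overline{\chi_3}$. In particular, it is a nilsequence of degree strictly less than $d$.

To show reflexivity, we must confirm that $\chi \otimes \overline{\chi}$ is a nilsequence of degree $< d$ for any nilcharacter $\chi \in \Xi^d(\Omega)$. If we write $\chi(n) = F(g(n){}^*\Gamma)$, where $F \in \operatorname{Lip}({}^*(G/\Gamma) \to S^\omega)$ has a vertical frequency $\eta$, we have

$$\chi \otimes \overline{\chi}(n) = (F \otimes \overline{F})(g(n){}^*\Gamma).$$

Noting that $F \otimes \overline{F}$ is invariant with respect to the $G_d$ action, we may quotient out by this central group and represent $\chi \otimes \overline{\chi}$ using a nilmanifold of degree $< d$. □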

The space Symb d (0) has many nice properties. 

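Lemma E.8 (Symbol calculus). Let $H = (H, +)$ be an abelian $I$-filtered group, let $d \in I$, and let $\Omega$ be a limit subset of ${}^*H$.

(i) If $\chi, \chi' \in \Xi^d(\Omega)$ and $\psi \in \operatorname{Nil}^{<d}(\Omega)$, and the components of $\chi$ are bounded linear combinations of those of $\chi' \otimes \psi$, then $\chi, \chi'$ are equivalent on $\Omega$ and thus $[\chi]_{\operatorname{Symb}^d(\Omega)} = [\chi']_{\operatorname{Symb}^d(\Omega)}$.

(ii) Conversely, if $\chi, \chi' \in \Xi^d(\Omega)$ are equivalent, then $\chi$ is a bounded linear combination of $\chi' \otimes \psi$ for some $\psi \in \operatorname{Nil}^{<d}(\Omega)$.

(iii) $\operatorname{Symb}^d(\Omega)$ is an abelian group with the group operation induced from tensor product.

(iv) If $\chi \in \Xi^d({}^*H)$ and $h \in {}^*H_i$ for some $i > 0$, then $\chi$ and $\chi(\cdot + h)$ are equivalent on ${}^*H$ (and thus on $\Omega$ also). Thus, $[\chi(\cdot + h)]_{\operatorname{Symb}^d(\Omega)} = [\chi]_{\operatorname{Symb}^d(\Omega)}$.

(v) If $H = \mathbb{Z}^k$ with either the multidegree or degree filtration, $\chi \in \Xi^d({}^*H)$ and $q \in \mathbb{Z}$, then $\chi^{\otimes q^{|d|}}$ and $\chi(q\,\cdot)$ are equivalent on ${}^*H$ (and thus on $\Omega$ also), thus $[\chi(q\,\cdot)]_{\operatorname{Symb}^d(\Omega)} = q^{|d|}[\chi]_{\operatorname{Symb}^d(\Omega)}$.

(vi) (Pullback) If $T : {}^*\mathbb{Z}^{k} \to {}^*\mathbb{Z}^{k'}$ is a linear transformation, and $\chi$ is a nilcharacter of degree $d$ on ${}^*\mathbb{Z}^{k'}$, then $\chi \circ T$ is a nilcharacter of degree $d$ on ${}^*\mathbb{Z}^{k}$. Moreover, if $\chi'$ is another nilcharacter of degree $d$ on ${}^*\mathbb{Z}^{k'}$ with $[\chi]_{\operatorname{Symb}^d({}^*\mathbb{Z}^{k'})} = [\chi']_{\operatorname{Symb}^d({}^*\mathbb{Z}^{k'})}$, then $[\chi \circ T]_{\operatorname{Symb}^d({}^*\mathbb{Z}^{k})} = [\chi' \circ T]_{\operatorname{Symb}^d({}^*\mathbb{Z}^{k})}$.

(vii) (Divisibility) If $H = \mathbb{Z}^k$ with either the multidegree or degree filtration, $d \neq 0$, $\chi \in \Xi^d(\Omega)$ and $q \in \mathbb{N}^+$, then there exists $\tilde\chi$ such that $[\chi]_{\operatorname{Symb}^d(\Omega)} = q[\tilde\chi]_{\operatorname{Symb}^d(\Omega)}$.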

Proof. The claim (i) follows from the same argument used to prove reflexivity in Lemma E.7. For (ii), we proceed much as in the proof of transitivity in Lemma E.7: write $\psi := \chi \otimes \overline{\chi'}$ and consider $\chi' \otimes \psi = (\chi' \otimes \overline{\chi'}) \otimes \chi$. Since 1 may be written as a linear combination of the components of $\chi' \otimes \overline{\chi'}$, the claim follows.

The claim (iii) follows easily from (i) and (ii). Part (iv) is more substantial. It should be compared to some of the consequences of the "bracket quadratic identities" developed in [28, Lemma 5.5].

From Definition 6.22 it suffices to show that the derivative $\Delta_h \chi(n) := \chi(n+h) \otimes \overline{\chi(n)}$ lies in $\operatorname{Nil}^{<d}({}^*H)$. We write $\chi(n) = F(g(n){}^*\Gamma)$, where $G/\Gamma$ is an $I$-filtered nilmanifold of degree $\leq d$, $g \in {}^*\mathrm{poly}(H_I \to G_I)$, and $F \in \operatorname{Lip}({}^*(G/\Gamma))$ has a vertical frequency $\eta : G_d \to \mathbb{R}$; then we have

$$\Delta_h \chi(n) = F((\partial_h g(n)) g(n){}^*\Gamma) \otimes \overline{F}(g(n){}^*\Gamma).$$

As $g \in {}^*\mathrm{poly}(H_I \to G_I)$ and $h \in {}^*H_i$, we have $\partial_h g \in {}^*\mathrm{poly}(H_I \to G_I^{+i})$, where $G_I^{+i} = (G_{j+i})_{j \in I}$ is the shifted filtration.

We now give $G^2$ an $I$-filtration by defining $(G^2)_j$ to be the group generated by $G_{j+i} \times \{\mathrm{id}\}$ and by the diagonal group $\{(g,g) : g \in G_j\}$. One easily verifies that this is a filtration, which is rational with respect to $\Gamma^2$. In particular, if we set $G^\square := (G^2)_0$ and $\Gamma^\square := \Gamma^2 \cap G^\square$, we have that $G^\square/\Gamma^\square$ is an $I$-filtered nilmanifold of degree $\leq d$. Furthermore, from Corollary B.4 we see that the map

$$O : n \mapsto (\partial_h g(n) g(n), g(n)){}^*\Gamma^\square$$

lies in ${}^*\mathrm{poly}(H_I \to G^\square_I/\Gamma^\square_I)$. We can thus write $\Delta_h \chi = \tilde F \circ O$, where $\tilde F \in \operatorname{Lip}({}^*(G^\square/\Gamma^\square))$ is the function

$$\tilde F(x,y) := F(x) \otimes \overline{F}(y).$$

This is still a degree $\leq d$ representation. But observe from the vertical character nature of $F$ that $\tilde F$ is invariant with respect to the action of the group $G^\square_d = \{(g_d, g_d) : g_d \in G_d\}$. Thus we may quotient by this group and descend to a degree $< d$ nilmanifold, and the claim follows.

Now we turn to (v), which is a similar claim to (iv). Writing $\chi = F(g(n){}^*\Gamma)$ as before, we reduce to showing that

$$n \mapsto F(g(qn){}^*\Gamma) \otimes \overline{F}(g(n){}^*\Gamma)^{\otimes q^{|d|}} \tag{E.1}$$

can be represented as a nilsequence of degree $< d$, with the convention that $F^{\otimes -q} = \overline{F}^{\otimes q}$ to deal with the case of negative exponents.

We give $G^2$ an $\mathbb{N}^k$-filtration by declaring $G^q_i$ to be the group generated by $G_j \times G_j$ for all $j > i$, together with the set $\{(g^{q^{|i|}}, g) : g \in G_i\}$. From the Baker-Campbell-Hausdorff formula one easily sees that this is a filtration, which is rational with respect to $\Gamma^2$; and so $G^q_0/\Gamma^q_0$ (with $\Gamma^q_0 := \Gamma^2 \cap G^q_0$) is a degree $\leq d$ nilmanifold. Also, from Taylor expansion (Lemma B.9) and Corollary B.4 we see that the map

$$O : n \mapsto (g(qn), g(n)){}^*\Gamma^q_0$$

lies in ${}^*\mathrm{poly}(H \to G^q_0/\Gamma^q_0)$. We then write (E.1) as $n \mapsto \tilde F(O(n))$, where $\tilde F \in \operatorname{Lip}({}^*(G^q_0/\Gamma^q_0))$ is the function

$$\tilde F(x,y) := F(x) \otimes \overline{F}(y)^{\otimes q^{|d|}}.$$

From the vertical character nature of $F$, we see that $\tilde F$ is invariant with respect to the action of $G^q_d = \{(g^{q^{|d|}}, g) : g \in G_d\}$. Quotienting out by this group as in the proof of (iv) we obtain the claim.

The claim (vi) follows easily from Corollary B.5, so we now turn to (vii). We will prove this for the multidegree filtration, as the degree filtration is similar. As usual we write $\chi = F(g(n){}^*\Gamma)$. Applying Taylor expansion (Lemma B.9) and the Baker-Campbell-Hausdorff formula, we may factorise

$$g(n) = \prod_j g_j^{n^j}$$

for some $g_j \in G_j$, where the product is over all multiindices $j \leq d$ (arranged in some arbitrary fashion). Taking roots of each of the $g_j$, we may write $g_j = (g'_j)^{q^{|j|}}$ for each $j$. We then have $g(n) = g'(qn)$, where $g'$ is the polynomial sequence

$$g'(n) := \prod_j (g'_j)^{n^j}.$$

If we write $\chi'(n) := F(g'(n){}^*\Gamma)$, we thus see that $\chi' \in \Xi^d(\Omega)$ and $\chi(n) = \chi'(qn)$, so by (v), $[\chi]_{\operatorname{Symb}^d(\Omega)} = q^{|d|}[\chi']_{\operatorname{Symb}^d(\Omega)}$. The claim now follows by setting $\tilde\chi := (\chi')^{\otimes q^{|d|-1}}$. □

If $P(n) = \alpha_0 + \dots + \alpha_d n^d$ is a polynomial of one variable $n$ of degree $d$, then $P$ is equal (up to degree $< d$ errors) to the multilinear form $Q(n, \dots, n)$, where $Q(n_1, \dots, n_d) := \alpha_d n_1 \cdots n_d$. A bit more generally, if $P(n_1, \dots, n_k)$ is a polynomial of $k$ variables $n_1, \dots, n_k$ of multidegree $d = (d_1, \dots, d_k)$, then $P$ is equal (up to degree $< d$ errors) to a degree $(1, \dots, 1)$ form $Q(n_1, \dots, n_1, \dots, n_k, \dots, n_k)$, where 1 is repeated $|d|$ times and each $n_i$ is repeated $d_i$ times. We may generalise this observation to nilcharacters. We begin with the simpler $k = 1$ case.
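The polynomial observation above can be checked numerically. The following is a minimal sketch, assuming an arbitrarily chosen cubic (the coefficients and the helper `finite_difference` are illustrative and not taken from the paper): it verifies that $P(n) - Q(n, \dots, n)$ has degree $< d$ by checking that its $d$-th forward difference vanishes identically.

```python
from fractions import Fraction
from math import comb, factorial


def finite_difference(f, order):
    """Return the order-th forward difference of f as a new function."""
    def diff(n):
        return sum(
            (-1) ** (order - i) * comb(order, i) * f(n + i)
            for i in range(order + 1)
        )
    return diff


# Hypothetical coefficients alpha_0, ..., alpha_3 of a degree d = 3 polynomial.
coeffs = [Fraction(3), Fraction(-1, 2), Fraction(0), Fraction(5, 7)]
d = len(coeffs) - 1


def P(n):
    """P(n) = alpha_0 + alpha_1 n + ... + alpha_d n^d."""
    return sum(a * n ** j for j, a in enumerate(coeffs))


def Q(*ns):
    """Multilinear form Q(n_1, ..., n_d) = alpha_d * n_1 * ... * n_d."""
    result = coeffs[d]
    for n in ns:
        result *= n
    return result


def error(n):
    """P(n) - Q(n, ..., n): should be a polynomial of degree < d."""
    return P(n) - Q(*([n] * d))


# The d-th forward difference of a polynomial of degree < d is zero.
values = [finite_difference(error, d)(n) for n in range(-5, 6)]
# By contrast, the d-th difference of P itself is the constant d! * alpha_d.
top_of_P = finite_difference(P, d)(0)
```

Here the vanishing of `values` is exactly the statement that the error has degree $< d$, while `top_of_P` records the top-degree information that $Q$ retains.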

Proposition E.9 (Multilinearisation, $k = 1$ case). Let $d \in \mathbb{N}$ and $\chi \in \Xi^d({}^*\mathbb{Z})$. Then there exists $\tilde\chi \in \Xi^{(1,\dots,1)}({}^*\mathbb{Z}^d)$ (where 1 is repeated $d$ times) such that the nilcharacter

$$\chi' : n \mapsto \tilde\chi(n, \dots, n)$$

(where $n$ is repeated $d$ times) is equivalent to $\chi$ in $\Xi^d({}^*\mathbb{Z})$ (thus $[\chi]_{\Xi^d({}^*\mathbb{Z})} = [\chi']_{\Xi^d({}^*\mathbb{Z})}$). Furthermore, one can select $\tilde\chi(n_1, \dots, n_d)$ to be symmetric with respect to permutations of $n_1, \dots, n_d$.

To motivate this proposition, we present an "almost-example" of this proposition in action: if $d = 2$ and $\chi$ is the degree 2 almost-nilcharacter

$$\chi(n) := e(\{\alpha n\}\beta n),$$

(where the "almost" arises because the relevant function $F$ is only piecewise Lipschitz rather than Lipschitz, as discussed at the start of §6) then one can take

$$\tilde\chi(n_1, n_2) := e\left(\tfrac{1}{2}\{\alpha n_1\}\beta n_2 + \tfrac{1}{2}\{\alpha n_2\}\beta n_1\right) \tag{E.2}$$

which is a multidegree $(1,1)$ almost-nilcharacter, with $\tilde\chi(n, n)$ equivalent (and in fact exactly equal, in this case) to $\chi(n)$. More generally, if we are able to represent a nilcharacter in terms of bracket polynomials of the correct degree and rank, then the above proposition becomes obvious by inspection. Such a representation is in fact possible (by extending the theory in [46]), but we will proceed here instead by using abstract algebraic constructions.
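As a quick sanity check of the "almost-example", the sketch below evaluates $\chi$ and the symmetrised $\tilde\chi$ from (E.2) numerically. The frequencies $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$ are arbitrary illustrative choices, and the fractional part is taken in $[0,1)$ for convenience (an assumption; the identity on the diagonal does not depend on which fundamental domain is used).

```python
import cmath
import math


def e(x):
    """The standard character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def frac(x):
    """Fractional part in [0, 1) (illustrative choice of fundamental domain)."""
    return x - math.floor(x)


# Hypothetical frequencies, chosen only for illustration.
alpha, beta = math.sqrt(2), math.sqrt(3)


def chi(n):
    """Degree 2 almost-nilcharacter chi(n) = e({alpha n} beta n)."""
    return e(frac(alpha * n) * beta * n)


def chi_tilde(n1, n2):
    """Symmetric multidegree (1,1) almost-nilcharacter from (E.2)."""
    return e(0.5 * frac(alpha * n1) * beta * n2
             + 0.5 * frac(alpha * n2) * beta * n1)


# On the diagonal n1 = n2 = n the two expressions should agree.
diagonal_gap = max(abs(chi_tilde(n, n) - chi(n)) for n in range(-20, 21))
# Symmetry under swapping the two variables.
symmetry_gap = max(
    abs(chi_tilde(n1, n2) - chi_tilde(n2, n1))
    for n1 in range(-5, 6) for n2 in range(-5, 6)
)
```

Both gaps should be zero up to floating-point rounding, reflecting the exact equality and symmetry asserted in the text.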

Proof. This will be a more complicated version of the argument used to establish claims (iv), (v) of Lemma E.8. It will be convenient for technical reasons to construct $\tilde\chi$ so that $\chi'$ is equivalent to $\chi^{\otimes d!}$ rather than to $\chi$ itself; to recover the original claim in the proposition, one simply appeals to Lemma E.8(vii).

We have $\chi(n) = F(g(n){}^*\Gamma)$ for some degree $d$ nilmanifold $G/\Gamma$, some polynomial sequence $g \in {}^*\mathrm{poly}(\mathbb{Z}_{\mathbb{N}} \to (G/\Gamma)_{\mathbb{N}})$, and some $F \in \operatorname{Lip}({}^*(G/\Gamma) \to S^\omega)$ obeying the vertical frequency property

$$F(g_d x) = e(\eta(g_d))F(x)$$

for all $x \in G/\Gamma$ and $g_d \in G_d$, where $\eta : G_d \to \mathbb{R}$ is a continuous homomorphism that maps $\Gamma_d$ to the integers.

We now build the various components $\tilde G$, $\tilde\eta$, $\tilde g$, $\tilde F$ required to construct $\tilde\chi$. (A simple example of this construction will be given after the end of this proof.)

The first step is to build the multidegree $(1,\dots,1)$ nilpotent group $\tilde G$. We will construct this group via its nilpotent Lie algebra $\log \tilde G$. As a (real) vector space, this Lie algebra will be given as a direct sum

$$\log \tilde G := \bigoplus_{J \subseteq \{1,\dots,d\}} \log G_{|J|}.$$

For each $J \subseteq \{1, \dots, d\}$, let $\iota_J : \log G_{|J|} \to \log \tilde G$ be the vector space embedding indicated by this direct sum, thus every element of $\log \tilde G$ can be uniquely expressed in the form $\sum_{J \subseteq \{1,\dots,d\}} \iota_J(x_J)$ for some $x_J \in \log G_{|J|}$.

Next, we endow $\log \tilde G$ with a Lie bracket structure by declaring

$$[\iota_J(x_J), \iota_K(y_K)] = 0$$

whenever $J, K \subseteq \{1, \dots, d\}$ intersect and $x_J \in \log G_{|J|}$, $y_K \in \log G_{|K|}$, and

$$[\iota_J(x_J), \iota_K(y_K)] = \iota_{J \cup K}([x_J, y_K])$$

whenever $J, K \subseteq \{1, \dots, d\}$ are disjoint and $x_J \in \log G_{|J|}$, $y_K \in \log G_{|K|}$. One easily verifies that this operation obeys the axioms of a Lie bracket (i.e. it is bilinear, antisymmetric, and obeys the Jacobi identity), and so $\log \tilde G$ is a Lie algebra.

We now give $\log \tilde G$ a multidegree filtration. For any $(a_1, \dots, a_d) \in \mathbb{N}^d$, let $\log \tilde G_{(a_1,\dots,a_d)}$ be the sub-Lie-algebra of $\log \tilde G$ generated by the $\iota_J(x_J)$ for which $1_J(j) \geq a_j$ for each $j = 1, \dots, d$, and $x_J \in \log G_{|J|}$. One easily verifies that this is a multidegree filtration of multidegree $(1, \dots, 1)$, and so one can exponentiate to create a multidegree-filtered Lie group $\tilde G$ of multidegree $(1, \dots, 1)$ also.

We define a lattice $\tilde\Gamma$ in $\tilde G$ to be the group generated by $\exp(M!\,\iota_J(\log \gamma_J))$ for all $J \subseteq \{1, \dots, d\}$ and $\gamma_J \in \Gamma_{|J|}$, where $M$ is a fixed natural number (depending only on $d$) which we will assume to be sufficiently large. From the Baker-Campbell-Hausdorff formula we see that this is indeed a lattice, and so $\tilde G/\tilde\Gamma$ is a nilmanifold. For $M$ large enough, we see from further application of the Baker-Campbell-Hausdorff formula that $\tilde\Gamma_{(1,\dots,1)}$ is contained in $\exp(\iota_{\{1,\dots,d\}}(\log \Gamma_d))$.

Next, we define a vertical frequency $\tilde\eta$ on $\tilde G_{(1,\dots,1)}$ by setting

$$\tilde\eta(\exp(\iota_{\{1,\dots,d\}}(\log g_d))) := \eta(g_d).$$

One easily verifies that $\tilde\eta$ is a vertical frequency (here we use the inclusion $\tilde\Gamma_{(1,\dots,1)} \subseteq \exp(\iota_{\{1,\dots,d\}}(\log \Gamma_d))$ and the central nature of $\tilde G_{(1,\dots,1)}$).

Now let $\tilde F \in \operatorname{Lip}({}^*(\tilde G/\tilde\Gamma) \to S^\omega)$ be a function with vertical frequency $\tilde\eta$; such a function can be constructed using partitions of unity as in (6.3).

The next step is to define $\tilde g$. From Lemma B.9 and many applications of the Baker-Campbell-Hausdorff formula, we may write

$$g(n) = \prod_{j=0}^{d} g_j^{n^j}$$

for some coefficients $g_j \in G_j$. We then write

$$\tilde g(n_1, \dots, n_d) := \prod_{j=0}^{d} \exp\Big(j! \sum_{J \subseteq \{1,\dots,d\} : |J| = j} \Big(\prod_{i \in J} n_i\Big) \iota_J(\log g_j)\Big).$$

Observe that each individual monomial

$$(n_1, \dots, n_d) \mapsto \exp\Big(j! \Big(\prod_{i \in J} n_i\Big) \iota_J(\log g_j)\Big)$$

with $0 \leq j \leq d$ and $|J| = j$ is a polynomial map in ${}^*\mathrm{poly}(\mathbb{Z}^d_{\mathbb{N}^d} \to \tilde G_{\mathbb{N}^d})$, so by Corollary B.4 and the Baker-Campbell-Hausdorff formula we see that the same is true for $\tilde g$.

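Finally, we set

$$\tilde\chi(n_1, \dots, n_d) := \tilde F(\tilde g(n_1, \dots, n_d){}^*\tilde\Gamma).$$

By construction, $\tilde\chi \in \Xi^{(1,\dots,1)}({}^*\mathbb{Z}^d)$, which by Lemma E.8(vi) (and the embeddings in Example 6.11) implies that $\chi' \in \Xi^d({}^*\mathbb{Z})$. It is also clear that $\tilde\chi$ is symmetric with respect to permutations of the $n_1, \dots, n_d$. It remains to show that $\chi'$ is equivalent to $\chi^{\otimes d!}$ in $\Xi^d({}^*\mathbb{Z})$, or in other words that the sequence

$$n \mapsto \chi(n)^{\otimes d!} \otimes \overline{\tilde\chi(n, \dots, n)}$$

is a nilsequence of degree $< d$. We expand this sequence as

$$(F^{\otimes d!} \otimes \overline{\tilde F})\Big(\prod_{j=0}^{d} \Big(g_j, \exp\Big(j! \sum_{J \subseteq \{1,\dots,d\} : |J| = j} \iota_J(\log g_j)\Big)\Big)^{n^j} \, {}^*(\Gamma \times \tilde\Gamma)\Big).$$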

The function $F^{\otimes d!} \otimes \overline{\tilde F}$ is a Lipschitz function on the nilmanifold $(G \times \tilde G)/(\Gamma \times \tilde\Gamma)$. Let $G^*$ be the subgroup of $G \times \tilde G$ defined as

$$G^* := \{(g_d, \exp(d!\,\iota_{\{1,\dots,d\}}(\log g_d))) : g_d \in G_d\} \leq G_d \times \tilde G_{(1,\dots,1)}.$$

This is a rational central subgroup. As $F$ and $\tilde F$ have vertical frequencies $\eta$ and $\tilde\eta$ respectively, we see that $F^{\otimes d!} \otimes \overline{\tilde F}$ is invariant in the $G^*$ direction, and thus descends to a Lipschitz function $F'$ on the nilmanifold $G'/\Gamma'$, where $G' := (G \times \tilde G)/G^*$ and $\Gamma'$ is the projection of $\Gamma \times \tilde\Gamma$ to $G'$. We thus have

$$\chi(n)^{\otimes d!} \otimes \overline{\tilde\chi(n, \dots, n)} = F'\Big(\prod_{j=1}^{d} (g'_j)^{n^j} \, {}^*\Gamma'\Big) \tag{E.3}$$

where $g'_j$ is the projection of $(g_j, \exp(j! \sum_{J \subseteq \{1,\dots,d\} : |J| = j} \iota_J(\log g_j)))$ to $G'$.

We now give $G'$ a degree filtration by defining $G'_j$ to be the group generated by elements of the form

$$\Big(h_j, \exp\Big(j! \sum_{J \subseteq \{1,\dots,d\} : |J| = j} \iota_J(\log h_j)\Big)\Big) \bmod G^*$$

for $h_j \in G_j$, together with elements of the form

$$(h_{j+1}, \mathrm{id}), \quad (\mathrm{id}, \exp(\iota_J(\log h_{j+1}))) \bmod G^*$$

for $h_{j+1} \in G_{j+1}$ and $J \subseteq \{1, \dots, d\}$ with $|J| = j + 1$. By a tedious number of applications of the Baker-Campbell-Hausdorff formula, we see that this is a filtration of degree $< d$ (here we use the fact that every set of cardinality $j + k$ has $\frac{(j+k)!}{j!\,k!}$ partitions into a set $J$ of cardinality $j$ and a set $K$ of cardinality $k$, which cancels the $j!$ prefactors appearing in the definition of $G'_j$). By construction, $g'_j \in G'_j$. Thus the right-hand side of (E.3) is a nilsequence of degree $< d$, and the claim follows. □
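The combinatorial fact used at the end of the proof can be confirmed by brute force. The sketch below (the function name `count_splittings` is illustrative only) enumerates all ways of splitting a set of cardinality $j+k$ into an ordered pair $(J, K)$ with $|J| = j$ and $|K| = k$, and compares the count with $(j+k)!/(j!\,k!)$; the identity $j!\,k!\cdot\#\{(J,K)\} = (j+k)!$ is the cancellation of factorial prefactors referred to in the text.

```python
from itertools import combinations
from math import factorial


def count_splittings(j, k):
    """Count pairs (J, K) with J, K disjoint, J ∪ K = {0, ..., j+k-1},
    |J| = j and |K| = k, by direct enumeration."""
    ground = set(range(j + k))
    count = 0
    for J in combinations(sorted(ground), j):
        K = ground - set(J)  # the complement is forced once J is chosen
        if len(K) == k:
            count += 1
    return count


# Compare the enumeration with the closed formula for small j, k.
table = {
    (j, k): (count_splittings(j, k),
             factorial(j + k) // (factorial(j) * factorial(k)))
    for j in range(0, 5) for k in range(0, 5)
}
```

For instance `table[(2, 3)]` pairs the enumerated count with the formula value for a five-element set split into parts of sizes 2 and 3.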



Example. We illustrate the above proposition with the simple $d = 2$ example mentioned before the proof. We consider a nilcharacter $\chi$ that is a vector-valued smoothing of the sequence $n \mapsto e(\{\alpha n\}\beta n)$ for some fixed frequencies $\alpha, \beta \in {}^*\mathbb{T}$, which we will write schematically as

$$\chi(n) \sim e(\{\alpha n\}\beta n).$$

As discussed in §6, such a nilcharacter arises from the Heisenberg nilmanifold (6.1) with the polynomial sequence

$$g(n) = e_2^{\beta n} e_1^{\alpha n}$$

and vertical character $\eta([e_1, e_2]^{t_{12}}) := -t_{12}$. We may Taylor expand $g$ as

$$g(n) = g_1^{n} g_2^{n^2}$$

where $g_1 := \exp(\alpha \log e_1 + \beta \log e_2) = e_1^{\alpha} e_2^{\beta} [e_1, e_2]^{-\alpha\beta/2}$ and $g_2 := [e_1, e_2]^{-\alpha\beta/2}$.
The nilpotent Lie algebra $\log \tilde G$ is the seven-dimensional vector space

$$\log \tilde G = \log G \oplus \log G \oplus \log G_{12}$$

with a basis of this space given by

$$\iota_1(\log e_1),\ \iota_1(\log e_2),\ \iota_1(\log[e_1,e_2]),\ \iota_2(\log e_1),\ \iota_2(\log e_2),\ \iota_2(\log[e_1,e_2]),\ \iota_{12}(\log[e_1,e_2]). \tag{E.4}$$

The Lie algebra commutation relations on basis elements are given by the formulae

$$[\iota_1(\log e_1), \iota_2(\log e_2)] = \iota_{12}(\log[e_1,e_2])$$
$$[\iota_1(\log e_2), \iota_2(\log e_1)] = -\iota_{12}(\log[e_1,e_2])$$

with all other pairs of basis elements commuting. This gives a nilpotent Lie group $\tilde G$ generated (as a Lie group) by the exponentials of (E.4), which we will label as

$$a_1 := \exp(\iota_1(\log e_1))$$
$$a_2 := \exp(\iota_1(\log e_2))$$
$$a_{12} := \exp(\iota_1(\log e_{12}))$$
$$b_1 := \exp(\iota_2(\log e_1))$$
$$b_2 := \exp(\iota_2(\log e_2))$$
$$b_{12} := \exp(\iota_2(\log e_{12}))$$
$$c_{12} := \exp(\iota_{12}(\log e_{12})),$$

thus one has the group commutation relations

$$[a_1, b_2] = c_{12}; \quad [a_2, b_1] = c_{12}^{-1}$$

with all other pairs of generators commuting. The generators $a_{12}$, $b_{12}$ will play no essential role in the analysis that follows and may be ignored by the reader.

The group $\tilde G$ is a multidegree $(1,1)$ filtered Lie group with filtration

$$\tilde G_{(0,0)} := \tilde G;$$
$$\tilde G_{(1,0)} := \langle a_1, a_2, a_{12}, c_{12} \rangle_{\mathbb{R}};$$
$$\tilde G_{(0,1)} := \langle b_1, b_2, b_{12}, c_{12} \rangle_{\mathbb{R}};$$
$$\tilde G_{(1,1)} := \langle c_{12} \rangle_{\mathbb{R}}.$$

To construct $\tilde\Gamma$, we may take $M = 1$, so that

$$\tilde\Gamma := \langle a_1, a_2, a_{12}, b_1, b_2, b_{12}, c_{12} \rangle.$$

From the Baker-Campbell-Hausdorff formula one sees that

$$\tilde\Gamma_{(1,1)} := \tilde\Gamma \cap \tilde G_{(1,1)} = \langle c_{12} \rangle.$$

A typical element of $\tilde G/\tilde\Gamma$ can be parameterised as

$$a_1^{r_1} a_2^{r_2} a_{12}^{r_{12}} b_1^{s_1} b_2^{s_2} b_{12}^{s_{12}} c_{12}^{t_{12}} \tilde\Gamma$$

for $r_1, r_2, r_{12}, s_1, s_2, s_{12}, t_{12} \in I_0$.

The polynomial sequence $\tilde g$ is given as

$$\tilde g(n_1, n_2) := \exp(n_1 \iota_1(\log g_1) + n_2 \iota_2(\log g_1)) \exp(2 n_1 n_2 \iota_{12}(\log g_2))$$
$$= \exp(\alpha n_1 \log a_1 + \beta n_1 \log a_2 + \alpha n_2 \log b_1 + \beta n_2 \log b_2) \times \exp(-\alpha\beta n_1 n_2 \log c_{12})$$

which by the Baker-Campbell-Hausdorff formula expands to

$$\tilde g(n_1, n_2) = a_1^{\alpha n_1} a_2^{\beta n_1} b_1^{\alpha n_2} b_2^{\beta n_2} c_{12}^{-\alpha\beta n_1 n_2}.$$

This is clearly a polynomial sequence. If we then let $\tilde\eta : \tilde G_{(1,1)} \to \mathbb{R}$ be the vertical character

$$\tilde\eta(\exp(t_{12}\,\iota_{12}(\log[e_1,e_2]))) := -t_{12}$$

and let $\tilde F : \tilde G/\tilde\Gamma \to S^1$ be the (piecewise) Lipschitz function

$$\tilde F(a_1^{r_1} a_2^{r_2} a_{12}^{r_{12}} b_1^{s_1} b_2^{s_2} b_{12}^{s_{12}} c_{12}^{t_{12}} \tilde\Gamma) := e(-t_{12})$$

for $r_1, r_2, r_{12}, s_1, s_2, s_{12}, t_{12} \in I_0$, then the sequence

$$\tilde\chi(n_1, n_2) := \tilde F(\tilde g(n_1, n_2){}^*\tilde\Gamma)$$

is almost a nilcharacter of multidegree $(1,1)$, if we make the usual cheat of ignoring the fact that $\tilde F$ is only piecewise Lipschitz rather than Lipschitz.
Now let us look at the diagonal sequence

$$\tilde\chi(n, n) = \tilde F(a_1^{\alpha n} a_2^{\beta n} b_1^{\alpha n} b_2^{\beta n} c_{12}^{-\alpha\beta n^2} \, {}^*\tilde\Gamma).$$

A brief computation using the Baker-Campbell-Hausdorff formula shows that one can rewrite

$$a_1^{\alpha n} a_2^{\beta n} b_1^{\alpha n} b_2^{\beta n} c_{12}^{-\alpha\beta n^2} \, {}^*\tilde\Gamma$$

as

$$a_1^{\{\alpha n\}} a_2^{\{\beta n\}} b_1^{\{\alpha n\}} b_2^{\{\beta n\}} c_{12}^{(\alpha n - \{\alpha n\})\{\beta n\} - (\beta n - \{\beta n\})\{\alpha n\} - \alpha\beta n^2} \, {}^*\tilde\Gamma.$$

Noting that $(\alpha n - \{\alpha n\})(\beta n - \{\beta n\})$ is an integer (cf. (6.4)), we can write the $c_{12}$ exponent modulo 1 as

$$\{\alpha n\}\{\beta n\} - 2\{\alpha n\}\beta n \bmod 1$$

and thus

$$\tilde\chi(n, n) = e(2\{\alpha n\}\beta n)\, e(-\{\alpha n\}\{\beta n\}).$$

The second factor $e(-\{\alpha n\}\{\beta n\})$ is a piecewise Lipschitz function of $(\alpha n \bmod 1, \beta n \bmod 1)$ and is thus almost a 1-step nilsequence. We thus see that $\tilde\chi(n, n)$ is almost equivalent (as a degree 2 almost nilcharacter) to $\chi(n)^2$. To eliminate the exponent of 2, one can go back to the start of the argument and replace $\beta$ (for instance) by $\beta/2$. The reader may verify that once one does so, the almost nilcharacter $\tilde\chi$ is essentially equal to (E.2).
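The reduction of the $c_{12}$ exponent modulo 1 can be verified in exact arithmetic. The following sketch uses rational stand-ins for $\alpha$ and $\beta$ (an assumption made purely so that `fractions.Fraction` gives exact results; the values are hypothetical) and checks that the full exponent and the reduced expression $\{\alpha n\}\{\beta n\} - 2\{\alpha n\}\beta n$ differ by an integer, namely $-(\alpha n - \{\alpha n\})(\beta n - \{\beta n\})$.

```python
from fractions import Fraction
from math import floor


def frac(x):
    """Fractional part in [0, 1); any fundamental domain gives the same check."""
    return x - floor(x)


def c12_exponent(alpha, beta, n):
    """Exponent of c_12 after reducing the a_i, b_i exponents to fractional parts."""
    a, b = alpha * n, beta * n
    return (a - frac(a)) * frac(b) - (b - frac(b)) * frac(a) - a * b


def reduced_exponent(alpha, beta, n):
    """Claimed value of the c_12 exponent modulo 1."""
    a, b = alpha * n, beta * n
    return frac(a) * frac(b) - 2 * frac(a) * b


# Hypothetical rational frequencies, used only to keep the arithmetic exact.
alpha, beta = Fraction(17, 211), Fraction(29, 97)

differences = [
    c12_exponent(alpha, beta, n) - reduced_exponent(alpha, beta, n)
    for n in range(-50, 51)
]
integer_parts_product = [
    -(alpha * n - frac(alpha * n)) * (beta * n - frac(beta * n))
    for n in range(-50, 51)
]
```

Every entry of `differences` should be an integer, and it should coincide with the corresponding entry of `integer_parts_product`.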

Finally, we mention that with the above example, the group $G^*$ takes the form

$$G^* := \{([e_1, e_2]^{t_{12}}, c_{12}^{2t_{12}}) : t_{12} \in \mathbb{R}\}$$

and the group $G' := (G \times \tilde G)/G^*$ has the degree 1 filtration

$$G'_0 := G'$$

G' x := {(e t 1 1 e 2 2 [el,e 2 ] tl ^a ^ 1 1 a 2 2 a ^ 1 ' 2 2 6 t 1 1 6 2 2 & ^ 1 ' 2 2 c t 1 | ! : txMMiAiAi e M l mod G * ■ 

One can verify by hand that this is indeed a degree 1 filtration on $G'$, which explains why $\chi(n)^2\, \overline{\tilde\chi(n, n)}$ is a degree 1 (almost) nilsequence.

This concludes the discussion of the example. Now we generalise Proposition E.9 to higher $k$.

Theorem E.10 (Multilinearisation). Let $\Omega$ be a limit subset of ${}^*\mathbb{Z}^k$, which we give the multidegree filtration. Let $d = (d_1, \dots, d_k) \in \mathbb{N}^k$, and $\chi \in \Xi^d(\Omega)$. Then there exists $\tilde\chi \in \Xi^{(1,\dots,1)}({}^*\mathbb{Z}^{|d|})$ (where 1 is repeated $|d|$ times) such that the nilcharacter

$$\chi' : (n_1, \dots, n_k) \mapsto \tilde\chi(n_1, \dots, n_1, n_2, \dots, n_2, \dots, n_k, \dots, n_k)$$

(where each $n_i$ is repeated $d_i$ times) is equivalent to $\chi$ in $\Xi^d(\Omega)$ (thus $[\chi]_{\Xi^d(\Omega)} = [\chi']_{\Xi^d(\Omega)}$). Furthermore, one can select

$$\tilde\chi(n_{1,1}, \dots, n_{1,d_1}, n_{2,1}, \dots, n_{2,d_2}, \dots, n_{k,1}, \dots, n_{k,d_k})$$

to be symmetric with respect to the permutation of $n_{i,1}, \dots, n_{i,d_i}$ for each $i = 1, \dots, k$.

Proof. Without loss of generality we may take $\Omega = {}^*\mathbb{Z}^k$. The argument is exactly the same as that used to establish Proposition E.9, except that the notation is more complicated. Accordingly, we will focus primarily on the notational setup in this proof.

As before, it will suffice to make $\chi'$ equivalent to $\chi^{\otimes d!}$ rather than $\chi$, where $d! := d_1! \cdots d_k!$. We have $\chi(n) = F(g(n){}^*\Gamma)$ for some multidegree $d$ nilmanifold $G/\Gamma$, some polynomial sequence $g \in {}^*\mathrm{poly}(\mathbb{Z}^k_{\mathbb{N}^k} \to (G/\Gamma)_{\mathbb{N}^k})$, and some $F \in \operatorname{Lip}({}^*(G/\Gamma) \to S^\omega)$, obeying the vertical frequency property

$$F(g_d x) = e(\eta(g_d))F(x)$$

for all $x \in G/\Gamma$ and $g_d \in G_d$, where $\eta : G_d \to \mathbb{R}$ is a vertical frequency.

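As before, we begin by building the nilpotent Lie algebra $\log \tilde G$. As a (real) vector space, this Lie algebra will be given as a direct sum

$$\log \tilde G := \bigoplus_{J \subseteq \{1,\dots,|d|\}} \log G_{\|J\|}$$

where $\|J\| \in \mathbb{N}^k$ is the vector

$$\|J\| := \big(|J \cap \{d_1 + \dots + d_{i-1} + 1, \dots, d_1 + \dots + d_i\}|\big)_{i=1}^{k}.$$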

For each $J \subseteq \{1, \dots, |d|\}$, let $\iota_J : \log G_{\|J\|} \to \log \tilde G$ be the vector space embedding indicated by this direct sum. Next, we endow $\log \tilde G$ with a Lie bracket structure by declaring

$$[\iota_J(x_J), \iota_K(y_K)] = 0$$

whenever $J, K \subseteq \{1, \dots, |d|\}$ intersect and $x_J \in \log G_{\|J\|}$, $y_K \in \log G_{\|K\|}$, and

$$[\iota_J(x_J), \iota_K(y_K)] = \iota_{J \cup K}([x_J, y_K])$$

whenever $J, K \subseteq \{1, \dots, |d|\}$ are disjoint and $x_J \in \log G_{\|J\|}$, $y_K \in \log G_{\|K\|}$. As before, one easily verifies the Lie bracket axioms.

We now give $\log \tilde G$ a multidegree filtration. For any $(a_1, \dots, a_{|d|}) \in \mathbb{N}^{|d|}$, let $\log \tilde G_{(a_1,\dots,a_{|d|})}$ be the sub-Lie-algebra of $\log \tilde G$ generated by the $\iota_J(x_J)$ for which $1_J(j) \geq a_j$ for each $j = 1, \dots, |d|$, and $x_J \in \log G_{\|J\|}$. As before, this is a multidegree filtration of multidegree $(1, \dots, 1)$, and exponentiates to create a multidegree-filtered Lie group $\tilde G$ of multidegree $(1, \dots, 1)$ also.

We define a lattice $\tilde\Gamma$ in $\tilde G$ to be the group generated by $\exp(M!\,\iota_J(\log \gamma_J))$ for all $J \subseteq \{1, \dots, |d|\}$ and $\gamma_J \in \Gamma_{\|J\|}$. Again, $\tilde G/\tilde\Gamma$ is a nilmanifold, and for $M$ large enough, $\tilde\Gamma_{(1,\dots,1)}$ is contained in $\exp(\iota_{\{1,\dots,|d|\}}(\log \Gamma_d))$.

As before, we define a vertical frequency $\tilde\eta$ on $\tilde G_{(1,\dots,1)}$ by the exact same formula:

$$\tilde\eta(\exp(\iota_{\{1,\dots,|d|\}}(\log g_d))) := \eta(g_d),$$

and then construct $\tilde F \in \operatorname{Lip}({}^*(\tilde G/\tilde\Gamma) \to S^\omega)$ with vertical frequency $\tilde\eta$.
The next step is to define $\tilde g$. As before, we have the Taylor expansion

$$g(n) = \prod_{j \leq d} g_j^{n^j}$$

for some coefficients $g_j \in G_j$, where $j = (j_1, \dots, j_k)$ now ranges over multi-indices less than or equal to $d$, arranged in some arbitrary order (e.g. lexicographical will suffice). We then write

$$\tilde g(n_1, \dots, n_{|d|}) := \prod_{j \leq d} \exp\Big(j! \sum_{J \subseteq \{1,\dots,|d|\} : \|J\| = j} \Big(\prod_{i \in J} n_i\Big) \iota_J(\log g_j)\Big),$$

recalling that $j! := j_1! \cdots j_k!$. As before, one verifies that $\tilde g$ is a polynomial map. Finally, we set

$$\tilde\chi(n_1, \dots, n_{|d|}) := \tilde F(\tilde g(n_1, \dots, n_{|d|}){}^*\tilde\Gamma).$$

The rest of the argument proceeds exactly as in Proposition E.9, the main difference being that $d$ is replaced with $|d|$ and $|J|$ with $\|J\|$, whenever necessary; we omit the details. □

Now we show how nilcharacters interact with the concept of bias. 

Lemma E.11 (Bias lemma). Let $k, d \in \mathbb{N}^+$ with $d \geq 2$, let $\chi$ be a degree $d$ nilcharacter on ${}^*\mathbb{Z}^k$ (with the degree filtration), and let $N$ be an unbounded limit natural number. Let $\Omega$ be a convex polytope in $[[N]]^k$, and let $P_1, \dots, P_k$ be dense subprogressions of $[[N]]$. Suppose that $1_\Omega(n) 1_{P_1 \times \dots \times P_k}(n) \chi(n)$ is $< d$-biased on $[[N]]^k$. Then on $[[N]]^k$, $\chi$ is equal to a nilsequence of degree $< d$.

Remark. Note that the claim fails for $d = 1$, even when $k = 1$; if $q > 1$ is a bounded integer, then the degree 1 nilcharacter $n \mapsto e(n/q)$ is of course biased on progressions of spacing $q$, but not on the original interval $[[N]]$. However, this is a purely "degree 1" obstruction and vanishes for higher degree.
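The degree 1 obstruction in the remark is easy to see numerically. The sketch below uses finite standard values $q = 5$ and $N = 1000$ as stand-ins for the bounded and unbounded quantities (an illustrative simplification, not the nonstandard setting of the lemma) and compares the average of $e(n/q)$ over the interval $\{-N, \dots, N\}$ with its average over the progression of spacing $q$ through the origin.

```python
import cmath
import math


def e(x):
    """The standard character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def average(values):
    """Arithmetic mean of a finite collection of complex numbers."""
    values = list(values)
    return sum(values) / len(values)


# Illustrative finite stand-ins for a bounded q and a large N.
q, N = 5, 1000

interval = range(-N, N + 1)                      # models [[N]]
progression = [n for n in interval if n % q == 0]  # spacing q

mean_on_interval = average(e(n / q) for n in interval)
mean_on_progression = average(e(n / q) for n in progression)
```

The character is constant equal to 1 on the progression, so its mean there has modulus 1, whereas on the full interval the mean is of size $O(1/N)$.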

Proof. Write $P := P_1 \times \dots \times P_k$. By Corollary E.6, $1_\Omega(n) 1_P(n) \chi(n)$ correlates with a nilcharacter of degree $d - 1$; we may absorb this nilcharacter into $\chi$, and assume that $1_\Omega(n) 1_P(n) \chi(n)$ is in fact biased.

By partitioning $\Omega \cap P$ into products $P' = P'_1 \times \dots \times P'_k$ of dense progressions of $[[N]]$ (using [23, Corollary A.2] to control the error), we see that there exists such a product $P' = P'_1 \times \dots \times P'_k$ for which

$$|\mathbb{E}_{n \in P'} \chi(n)| \gg 1.$$

Write $\chi = F \circ O$ for some degree $\leq d$ nilmanifold $G/\Gamma$, some $F \in \operatorname{Lip}({}^*(G/\Gamma))$ with a vertical frequency $\eta$, and some $O \in {}^*\mathrm{poly}(\mathbb{Z}^k \to G/\Gamma)$. Applying Theorem D.5 and using the pigeonhole principle to refine the progressions $P'_1, \dots, P'_k$ if necessary, we may assume without loss of generality that we can factorise

$$O(n) = \varepsilon_{P'}(n) g_{P'}(n){}^*\Gamma$$

for all $n \in P'$, where $g_{P'} \in {}^*\mathrm{poly}(\mathbb{Z}^k \to G_{P'})$ is totally equidistributed on $G_{P'}/\Gamma_{P'}$ for some standard rational subgroup $G_{P'}$ of $G$, and $\varepsilon_{P'} \in {}^*\mathrm{poly}(\mathbb{Z}^k \to G)$ is bounded and has Lipschitz constant $O(1/N)$ on $P'$, with the $i$-th Taylor coefficients of size $O(N^{-|i|})$ for each $i \in \mathbb{N}^k$.

For any $n, n_{P'} \in P'$, we have from the Lipschitz nature of $\varepsilon_{P'}$ that

$$F(O(n)) = F(\varepsilon_{P'}(n_{P'}) g_{P'}(n){}^*\Gamma) + O(|n - n_{P'}|/N),$$

and thus by dividing $P'$ into sufficiently small (but still dense) sub-products, we may assume that

$$|\mathbb{E}_{n \in P'} F(\varepsilon_{P'}(n_{P'}) g_{P'}(n){}^*\Gamma)| \gg 1$$

for some $n_{P'} \in P'$, which by the total equidistribution of $g_{P'}$ implies that

$$\Big|\int_{G_{P'}/\Gamma_{P'}} F(\varepsilon_{P'}(n_{P'}) x)\, d\mu_{G_{P'}/\Gamma_{P'}}(x)\Big| \gg 1.$$

As $F$ has vertical frequency $\eta$, this implies that $\eta$ must annihilate $G_{P',d}$, and so $F$ is invariant with respect to the action of this group. By quotienting out by this central group we may thus assume that $G_{P',d}$ is trivial, thus $G_{P'}/\Gamma_{P'}$ now has degree $< d$. We can then write

~ TL 

x (n) = F(gp,(n)*r P ,,— modi) 
oiv 

for all n G [[N]], where F : *(G pl /T pl x T) is defined so that 

~ Tl 

F(x,—)=F(e P ,(n)x) 

for $n \in [[N]]$ and $x \in {}^*(G_{P'}/\Gamma_{P'})$, and extended in a Lipschitz fashion to all of ${}^*(G_{P'}/\Gamma_{P'} \times \mathbb{T})$. This represents $\chi$ as a nilsequence of degree $< d$ on $P'$. Using the conjugate nature of the various sequences $g_{P'}$ in Theorem D.5 we conclude that $\chi$ can also be represented as a nilsequence of degree $< d$ on all translates $P' + h$ of $P'$. On the other hand, since $P'$ is dense in $[[N]]^k$, one can partition $1 = \sum_{j=1}^{J} \psi_j$ on $[[N]]^k$, where $J$ is bounded and the $\psi_j$ are degree $\leq 1$ nilsequences, each of which is supported on a translate $P' + h_j$ of $P'$. This implies that $\chi = \sum_{j=1}^{J} \psi_j \chi$. As $d \geq 2$, the $\psi_j$ have degree $< d$, and the claim now follows from Corollary E.2. □

We have the following useful consequence of Lemma E.11.

Corollary E.12 (Extrapolation lemma). Let $k, d \in \mathbb{N}^+$ with $d \geq 2$, let $\chi$ be a degree $d$ nilcharacter on ${}^*\mathbb{Z}^k$ (with the degree filtration), and let $N$ be an unbounded limit natural number. Let $P_1, \dots, P_k$ be dense subprogressions of $[[N]]$, and let $P := P_1 \times \dots \times P_k$. Then the following are equivalent:

(i) $\chi$ is $< d$-biased on $[[N]]^k$.

(ii) $\chi$ is $< d$-biased on $P$.

(iii) $[\chi]_{\Xi^d([[N]]^k)} = 0$.

(iv) $[\chi]_{\Xi^d(P)} = 0$.

Proof. We trivially have that (iii) implies (iv). Since $\chi$ correlates with itself, we see that (iii) implies (i) and (iv) implies (ii). Lemma E.11 gives that each of (i) and (ii) implies (iii), and the claim follows. □

The Pontryagin dual $\mathbb{T}$ of the integers $\mathbb{Z}$ of course contains plenty of torsion. It turns out however that this torsion is a purely degree 1 phenomenon, and disappears in higher degree.

Lemma E.13 (Torsion-free lemma). Let $k \in \mathbb{N}^+$, let $N$ be an unbounded integer, and let $d \geq 2$ be standard. Then the abelian group $\operatorname{Symb}^d([[N]]^k)$ (with the degree filtration) is torsion-free.

Proof. Writing $s$ for $d$, our goal is to show that if $q \geq 1$ is bounded and $\chi$ is a degree $\leq s$ nilcharacter such that $\chi^{\otimes q}$ is equal to a degree $< s$ nilsequence on $[[N]]^k$, then $\chi$ is also equal to a degree $< s$ nilsequence.

We modify the arguments used to prove Lemma E.11. We write $\chi = F \circ O$ where $G/\Gamma$ is a degree $s$ nilmanifold, $O \in {}^*\mathrm{poly}(\mathbb{Z}^k \to G/\Gamma)$, and $F \in \operatorname{Lip}({}^*(G/\Gamma))$ has a vertical frequency $\eta$; then we have

$$|\mathbb{E}_{n \in [N]^k} F(O(n))^{\otimes q} \otimes F_0(O_0(n))| \gg 1$$

for some degree $< s$ nilmanifold $G_0/\Gamma_0$, some $O_0 \in {}^*\mathrm{poly}(\mathbb{Z}^k \to G_0/\Gamma_0)$, and $F_0 \in \operatorname{Lip}({}^*(G_0/\Gamma_0))$. Using Theorem D.5, we may thus find a product $P = P_1 \times \dots \times P_k$ of progressions in $[[N]]^k$ and a factorisation

$$(O(n), O_0(n)) = (\varepsilon_P(n) g_P(n){}^*\Gamma, \varepsilon_{P,0}(n) g_{P,0}(n){}^*\Gamma_0)$$

where $\varepsilon_P \in {}^*\mathrm{poly}(\mathbb{Z}^k \to G)$, $\varepsilon_{P,0} \in {}^*\mathrm{poly}(\mathbb{Z}^k \to G_0)$ are bounded and Lipschitz on $[[N]]^k$ with Lipschitz constant $O(1/N)$, and $(g_P, g_{P,0}) \in {}^*\mathrm{poly}(\mathbb{Z}^k \to G_P)$ is totally equidistributed in $G_P/\Gamma_P$ for some rational subgroup $G_P$ of $G \times G_0$. Shrinking $P$ if necessary as in the proof of Lemma E.11, we may assume that

$$\Big|\int_{G_P/\Gamma_P} F(\varepsilon_P(n_P) x)^{\otimes q} \otimes F_0(\varepsilon_{P,0}(n_P) x_0)\, d\mu_{G_P/\Gamma_P}(x, x_0)\Big| \gg 1$$

for any $n_P \in P$. From the vertical character nature of $F$, this implies that $q\eta$ annihilates $(G_P)_s$. But $\eta$ is a continuous homomorphism on the connected abelian Lie group $(G_P)_s$, and so $\eta$ itself must also annihilate $(G_P)_s$. If we then quotient by this space, we can represent $\chi$ by a degree $< s$ nilsequence on $P$, and the claim now follows from Corollary E.12. □



Appendix F. A linearisation result from additive combinatorics 

In this appendix, we record a lemma from additive combinatorics (essentially in [16] or [21], and in the spirit of Freiman's inverse sumset theorem) which asserts that functions from a large subset of $[-N, N]$ to $\mathbb{T}$ with a large amount of additive structure are essentially bracket-linear in nature.

Lemma F.1 (Linearisation lemma). Let $\varepsilon > 0$ be a limit real, let $N$ be a limit natural number, let $H$ be a dense subset of $[[N]]$, let $\alpha \in {}^*\mathbb{T}$ be a frequency, and let $\xi_1, \xi_2, \xi_3, \xi_4 : H \to {}^*\mathbb{T}$ be limit functions such that

$$\xi_1(h_1) + \xi_2(h_2) + \xi_3(h_3) + \xi_4(h_4) = \alpha + O(\varepsilon) \tag{F.1}$$

for many additive quadruples $(h_1, h_2, h_3, h_4) \in H$. Then there exists a standard $K \geq 0$, a frequency $\delta \in {}^*\mathbb{T}$, a dense subset $H'$ of $H$, and a Freiman homomorphism $\xi : H' \to {}^*\mathbb{T}$ of the form

$$\xi(h) = \sum_{k=1}^{K} \{\alpha_k h\}\beta_k \bmod 1$$

for all $h \in H'$ and some $\alpha_k \in {}^*\mathbb{T}$ and $\beta_k \in {}^*\mathbb{R}$, such that

$$\xi_1(h) = \xi(h) + \delta + O(\varepsilon) \tag{F.2}$$

for many $h \in H'$.

Proof. We may replace $\varepsilon$ by $1/M$ for some limit integer $M$. By rounding each $\xi_i(h)$ to the nearest multiple of $1/M$, we may assume that $\xi_i(h)$ is a multiple of $1/M$ for all $h \in H$ and $i = 1, 2, 3, 4$. There are now only a bounded number of possibilities for the right-hand side $\alpha + O(\varepsilon)$, so by the pigeonhole principle (and by redefining $\alpha$ if necessary) we may assume that

$$\xi_1(h_1) + \xi_2(h_2) + \xi_3(h_3) + \xi_4(h_4) = \alpha$$

for many additive quadruples $(h_1, h_2, h_3, h_4)$ in $H$.

For each $i = 1, 2, 3, 4$, let $\Gamma_i \subset {}^*\mathbb{Z} \times {}^*\mathbb{T}$ be the (limit) graph $\Gamma_i := \{(h, \xi_i(h) \bmod 1) : h \in H\}$. Then by the preceding discussion, we see that $(0, \alpha)$ has $\gg N^3$ representations of the form $\gamma_1 + \gamma_2 + \gamma_3 + \gamma_4$, where $\gamma_i \in \Gamma_i$ for $i = 1, 2, 3, 4$. On the other hand, from several applications of the Cauchy-Schwarz inequality, the number of such quadruples is bounded by $\prod_{i=1}^{4} E(\Gamma_i)^{1/4}$, where $E(\Gamma_i)$ is the number of additive quadruples in $\Gamma_i$ (i.e. the additive energy of $\Gamma_i$). Since we have the trivial upper bound $E(\Gamma_i) \ll N^3$ for all $i$, we conclude that

$$E(\Gamma_1) \gg N^3.$$
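To make the notion of additive energy concrete, here is a small finite sketch. It models the graph $\Gamma_i$ by pairs $(h, x)$ with $x$ taken in $\mathbb{Z}/M\mathbb{Z}$ (a discretisation in the spirit of the rounding to multiples of $1/M$ above; the specific values $N = 20$, $M = 101$ and the two test functions are hypothetical). It counts quadruples with $\gamma_1 + \gamma_2 = \gamma_3 + \gamma_4$ and compares a linear graph, which has energy comparable to the trivial bound $|H|^3$, with a quadratic graph, which has much smaller energy.

```python
from collections import Counter
from itertools import product


def additive_energy(points, modulus):
    """Number of quadruples (g1, g2, g3, g4) of points with g1 + g2 = g3 + g4,
    where the second coordinate is added modulo `modulus`."""
    sums = Counter(
        (h1 + h2, (x1 + x2) % modulus)
        for (h1, x1), (h2, x2) in product(points, repeat=2)
    )
    # Each value of the sum attained c times contributes c^2 quadruples.
    return sum(c * c for c in sums.values())


# Illustrative finite parameters (assumptions, not from the paper).
N, M = 20, 101
H = range(-N, N + 1)

linear_graph = [(h, (7 * h) % M) for h in H]      # graph of h -> 7h/M
quadratic_graph = [(h, (h * h) % M) for h in H]   # graph of h -> h^2/M

energy_linear = additive_energy(linear_graph, M)
energy_quadratic = additive_energy(quadratic_graph, M)
trivial_bound = len(H) ** 3
```

The linear graph inherits the full additive energy of the interval $H$, which is the kind of large-energy situation that the Balog-Szemerédi-Gowers machinery in the next step exploits.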

At this point we invoke some standard additive combinatorial machinery from 
[2"T] (see also [TBI IHI])- Applying the Balog-Szemeredi-Gowers lemma followed by 
the Plünnecke-Ruzsa inequalities exactly as in [21, Proposition 5.4], we can find a dense subset $\Gamma'$ of $\Gamma_1$ such that $|9\Gamma' - 8\Gamma'| \ll N$. Applying [21, Lemma 9.2], we can refine to a further dense subset $\Gamma'' := \{(h, \xi_1(h) \bmod 1) : h \in H''\}$ such that $4\Gamma'' - 4\Gamma''$ is a graph; thus there exists a Freiman homomorphism $\zeta : 2H'' - 2H'' \to \mathbb{T}$ such that

$$\xi_1(h_1) + \xi_1(h_2) - \xi_1(h_3) - \xi_1(h_4) = \zeta(h_1 + h_2 - h_3 - h_4) \tag{F.3}$$

for all $h_1, h_2, h_3, h_4 \in H''$. By the Bogolyubov lemma (see [21, Lemma 6.3]), $2H'' - 2H''$ contains a dense regular Bohr set $B$ of bounded rank (see [21] for definitions; strictly speaking, one has to identify an interval such as $[[10N]]$ with $\mathbb{Z}/20N\mathbb{Z}$ in order to apply these tools, but this is not difficult to do). Arguing as in [21, Proposition 10.8], we see that we may write

$$\zeta(h) = \sum_{j=1}^{k} \{\alpha_j h\}\beta_j \bmod 1$$

for $h \in B$ for some standard $k$ and frequencies $\alpha_j$, $\beta_j$. Applying (F.3) and the pigeonhole principle, we obtain the claim, except possibly for the claim that $\xi$ is a Freiman homomorphism. But observe that if we restrict the fractional part $\{\alpha_j h\}$ to a sub-interval of $I_0$ of length at most $1/10$ (say) then we obtain the Freiman homomorphism property automatically; so the claim follows from one final application of the pigeonhole principle. □
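The final observation of the proof can be tested in exact arithmetic on a toy instance. The sketch below takes a single bracket-linear term $\xi(h) = \{\alpha h\}\beta \bmod 1$ with hypothetical rational $\alpha, \beta$ (rational only so that `fractions.Fraction` is exact) and a finite interval in place of $[[N]]$. It checks that $\xi$ fails to be a Freiman homomorphism on the whole interval, but becomes one once $h$ is restricted to those values with $\{\alpha h\}$ in a sub-interval of length $1/10$: on such a set the integer $\{\alpha h_1\} + \{\alpha h_2\} - \{\alpha h_3\} - \{\alpha h_4\}$ has absolute value less than 1 and so must vanish.

```python
from fractions import Fraction
from itertools import product
from math import floor


def frac(x):
    """Fractional part in [0, 1) (illustrative choice of fundamental domain)."""
    return x - floor(x)


# Hypothetical parameters, chosen only for illustration.
alpha, beta, N = Fraction(17, 211), Fraction(5, 3), 30


def xi(h):
    """Single-term bracket-linear function {alpha h} beta mod 1."""
    return frac(frac(alpha * h) * beta)


def freiman_violations(H):
    """Additive quadruples h1 + h2 = h3 + h4 in H for which
    xi(h1) + xi(h2) - xi(h3) - xi(h4) is nonzero modulo 1."""
    members = set(H)
    values = {h: xi(h) for h in members}
    bad = []
    for h1, h2, h3 in product(sorted(members), repeat=3):
        h4 = h1 + h2 - h3
        if h4 in members:
            defect = values[h1] + values[h2] - values[h3] - values[h4]
            if defect.denominator != 1:  # not an integer, so nonzero mod 1
                bad.append((h1, h2, h3, h4))
    return bad


H_all = list(range(-N, N + 1))
H_restricted = [h for h in H_all if frac(alpha * h) < Fraction(1, 10)]

violations_all = freiman_violations(H_all)
violations_restricted = freiman_violations(H_restricted)
```

In this toy instance the restricted set still contains several elements, so the absence of violations there is not vacuous.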